OSDN Git Service

uclinux-h8/linux.git
2 years agobnxt_en: move coredump functions into dedicated file
Edwin Peer [Fri, 29 Oct 2021 07:47:48 +0000 (03:47 -0400)]
bnxt_en: move coredump functions into dedicated file

Change bnxt_get_coredump() and bnxt_get_coredump_length() to non-static
functions.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: Refactor coredump functions
Edwin Peer [Fri, 29 Oct 2021 07:47:47 +0000 (03:47 -0400)]
bnxt_en: Refactor coredump functions

The coredump functionality will be used by devlink health. Refactor
these functions that get coredump and coredump length. There is no
functional change, but the following checkpatch warnings were
addressed:

  - strscpy is preferred over strlcpy.
  - sscanf results should be checked, with an additional warning.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: improve fw diagnose devlink health messages
Edwin Peer [Fri, 29 Oct 2021 07:47:46 +0000 (03:47 -0400)]
bnxt_en: improve fw diagnose devlink health messages

Add firmware event counters as well as health state severity. In
the unhealthy state, recommend a remedy and inform the user as to
its impact.

Readability of the devlink tool's output is negatively impacted by
adding these fields to the diagnosis. The single line of text, as
rendered by devlink health diagnose, benefits from more terse
descriptions, which can be substituted without loss of clarity, even
in pretty printed JSON mode.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: consolidate fw devlink health reporters
Edwin Peer [Fri, 29 Oct 2021 07:47:45 +0000 (03:47 -0400)]
bnxt_en: consolidate fw devlink health reporters

Merge 'fw' and 'fw_fatal' health reporters.  There is no longer a need
to distinguish between firmware reporters. Only bonafide errors are
reported now and no reports were being generated for the 'fw' reporter.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: remove fw_reset devlink health reporter
Edwin Peer [Fri, 29 Oct 2021 07:47:44 +0000 (03:47 -0400)]
bnxt_en: remove fw_reset devlink health reporter

Firmware resets initiated by the user are not errors and should not
be reported via devlink. Once only unsolicited resets remain, it is no
longer sensible to maintain a separate fw_reset reporter.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: improve error recovery information messages
Edwin Peer [Fri, 29 Oct 2021 07:47:43 +0000 (03:47 -0400)]
bnxt_en: improve error recovery information messages

The recovery election messages are often mistaken for errors. Improve
the wording to clarify the meaning of these frequent and expected
events. Also, take the first step towards more inclusive language.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: add enable_remote_dev_reset devlink parameter
Edwin Peer [Fri, 29 Oct 2021 07:47:42 +0000 (03:47 -0400)]
bnxt_en: add enable_remote_dev_reset devlink parameter

The reported parameter value should not take into account the state
of remote drivers. Firmware will reject remote resets as appropriate,
thus it is not strictly necessary to check HOT_RESET_ALLOWED before
attempting to initiate a reset. But we add the check so that we can
provide more intuitive messages when reset is not permitted.

This firmware setting needs to be restored from all functions after
a firmware reset.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: implement devlink dev reload fw_activate
Edwin Peer [Fri, 29 Oct 2021 07:47:41 +0000 (03:47 -0400)]
bnxt_en: implement devlink dev reload fw_activate

Similar to reload driver_reinit, the RTNL lock is held across reload
down and up to prevent interleaving state changes.  But we need to
subsequently release the RTNL lock while waiting for firmware reset
to complete.

Also keep a statistic on fw_activate resets initiated remotely from
other functions.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: implement devlink dev reload driver_reinit
Edwin Peer [Fri, 29 Oct 2021 07:47:40 +0000 (03:47 -0400)]
bnxt_en: implement devlink dev reload driver_reinit

The RTNL lock must be held between down and up to prevent interleaving
state changes, especially since external state changes might release
and allocate different driver resource subsets that would otherwise
need to be tracked and carefully handled. If the down function fails,
then devlink will not call the corresponding up function, thus the
lock is released in the down error paths.

v2: Don't use devlink_reload_disable() and devlink_reload_enable().
Instead, check that the netdev is not in unregistered state before
proceeding with reload.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-Off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: refactor cancellation of resource reservations
Edwin Peer [Fri, 29 Oct 2021 07:47:39 +0000 (03:47 -0400)]
bnxt_en: refactor cancellation of resource reservations

Resource reservations will also need to be reset after FUNC_DRV_UNRGTR
in the following devlink driver_reinit patch. Extract this logic into a
reusable function.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agobnxt_en: refactor printing of device info
Edwin Peer [Fri, 29 Oct 2021 07:47:38 +0000 (03:47 -0400)]
bnxt_en: refactor printing of device info

The device info logged during probe will be reused by the devlink
driver_reinit code in a following patch. Extract this logic into
the new bnxt_print_device_info() function. The board index needs
to be saved in the driver context so that the board information
can be retrieved at a later time, outside of the probe function.

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'code-movement-to-br_switchdev-c'
Jakub Kicinski [Fri, 29 Oct 2021 03:06:02 +0000 (20:06 -0700)]
Merge branch 'code-movement-to-br_switchdev-c'

Vladimir Oltean says:

====================
Code movement to br_switchdev.c

This is one more refactoring patch set for the Linux bridge, where more
logic that is specific to switchdev is moved into br_switchdev.c, which
is compiled out when CONFIG_NET_SWITCHDEV is disabled.
====================

Link: https://lore.kernel.org/r/20211027162119.2496321-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bridge: switchdev: consistent function naming
Vladimir Oltean [Wed, 27 Oct 2021 16:21:19 +0000 (19:21 +0300)]
net: bridge: switchdev: consistent function naming

Rename all recently imported functions in br_switchdev.c to start with a
br_switchdev_* prefix.

br_fdb_replay_one() -> br_switchdev_fdb_replay_one()
br_fdb_replay() -> br_switchdev_fdb_replay()
br_vlan_replay_one() -> br_switchdev_vlan_replay_one()
br_vlan_replay() -> br_switchdev_vlan_replay()
struct br_mdb_complete_info -> struct br_switchdev_mdb_complete_info
br_mdb_complete() -> br_switchdev_mdb_complete()
br_mdb_switchdev_host_port() -> br_switchdev_host_mdb_one()
br_mdb_switchdev_host() -> br_switchdev_host_mdb()
br_mdb_replay_one() -> br_switchdev_mdb_replay_one()
br_mdb_replay() -> br_switchdev_mdb_replay()
br_mdb_queue_one() -> br_switchdev_mdb_queue_one()

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bridge: mdb: move all switchdev logic to br_switchdev.c
Vladimir Oltean [Wed, 27 Oct 2021 16:21:18 +0000 (19:21 +0300)]
net: bridge: mdb: move all switchdev logic to br_switchdev.c

The following functions:

br_mdb_complete
br_switchdev_mdb_populate
br_mdb_replay_one
br_mdb_queue_one
br_mdb_replay
br_mdb_switchdev_host_port
br_mdb_switchdev_host
br_switchdev_mdb_notify

are only accessible from code paths where CONFIG_NET_SWITCHDEV is
enabled. So move them to br_switchdev.c, in order for that code to be
compiled out if that config option is disabled.

Note that br_switchdev.c gets build regardless of whether
CONFIG_BRIDGE_IGMP_SNOOPING is enabled or not, whereas br_mdb.c only got
built when CONFIG_BRIDGE_IGMP_SNOOPING was enabled. So to preserve
correct compilation with CONFIG_BRIDGE_IGMP_SNOOPING being disabled, we
must now place an #ifdef around these functions in br_switchdev.c.
The offending bridge data structures that need this are
br->multicast_lock and br->mdb_list, these are also compiled out of
struct net_bridge when CONFIG_BRIDGE_IGMP_SNOOPING is turned off.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bridge: split out the switchdev portion of br_mdb_notify
Vladimir Oltean [Wed, 27 Oct 2021 16:21:17 +0000 (19:21 +0300)]
net: bridge: split out the switchdev portion of br_mdb_notify

Similar to fdb_notify() and br_switchdev_fdb_notify(), split the
switchdev specific logic from br_mdb_notify() into a different function.
This will be moved later in br_switchdev.c.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bridge: move br_vlan_replay to br_switchdev.c
Vladimir Oltean [Wed, 27 Oct 2021 16:21:16 +0000 (19:21 +0300)]
net: bridge: move br_vlan_replay to br_switchdev.c

br_vlan_replay() is relevant only if CONFIG_NET_SWITCHDEV is enabled, so
move it to br_switchdev.c.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bridge: provide shim definition for br_vlan_flags
Vladimir Oltean [Wed, 27 Oct 2021 16:21:15 +0000 (19:21 +0300)]
net: bridge: provide shim definition for br_vlan_flags

br_vlan_replay() needs this, and we're preparing to move it to
br_switchdev.c, which will be compiled regardless of whether or not
CONFIG_BRIDGE_VLAN_FILTERING is enabled.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mlxsw-offload-root-tbf-as-port-shaper'
Jakub Kicinski [Fri, 29 Oct 2021 02:57:37 +0000 (19:57 -0700)]
Merge branch 'mlxsw-offload-root-tbf-as-port-shaper'

Ido Schimmel says:

====================
mlxsw: Offload root TBF as port shaper

Petr says:

Egress configuration in an mlxsw deployment would generally have an ETS
qdisc at root, with a number of bands and a priority dispatch between them.
Some of those bands could then have a RED and/or TBF qdiscs attached.

When TBF is used like this, mlxsw configures shaper on a subgroup, which is
the pair of traffic classes (UC + BUM) corresponding to the band where TBF
is installed. This way it is possible to limit traffic on several bands
(subgroups) independently by configuring several TBF qdiscs, each on a
different band.

It is however not possible to limit traffic flowing through the port as
such. The ASIC supports this through port shapers (as opposed to the
abovementioned subgroup shapers). An obvious way to express this as a user
would be to configure a root TBF qdisc, and then add the whole ETS
hierarchy as its child.

TBF (and RED) can currently be used as a root qdisc. This usage has always
been accepted as a special case, when only one subgroup is configured, and
that is the subgroup that root TBF and RED configure. However it was never
possible to install ETS under that TBF.

In this patchset, this limitation is relaxed. TBF qdisc in root position is
now always offloaded as a port shaper. Such TBF qdisc does not limit
offload of further children. It is thus possible to configure the usual
priority classification through ETS, with RED and/or TBF on individual
bands, all that below a port-level TBF. For example:

    (1) # tc qdisc replace dev swp1 root handle 1: tbf rate 800mbit burst 16kb limit 1M
    (2) # tc qdisc replace dev swp1 parent 1:1 handle 11: ets strict 8 priomap 7 6 5 4 3 2 1 0
    (3) # tc qdisc replace dev swp1 parent 11:1 handle 111: tbf rate 600mbit burst 16kb limit 1M
    (4) # tc qdisc replace dev swp1 parent 11:2 handle 112: tbf rate 600mbit burst 16kb limit 1M

Here, (1) configures a 800-Mbps port shaper, (2) adds an ETS element with 8
strictly-prioritized bands, and (3) and (4) configure two more shapers,
each 600 Mbps, one under 11:1 (band 0, TCs 7 and 15), one under 11:2 (band
1, TCs 6 and 14). This way, traffic on bands 0 and 1 are each independently
capped at 600 Mbps, and at the same time, traffic through the port as a
whole is capped at 800 Mbps.

In patch #1, TBF is permitted as root qdisc, under which the usual qdisc
tree can be installed.

In patch #2, the qdisc offloadability selftest is extended to cover the
root TBF as well.

Patch #3 then tests that the offloaded TBF shapes as expected.
====================

Link: https://lore.kernel.org/r/20211027152001.1320496-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Test port shaper
Petr Machata [Wed, 27 Oct 2021 15:20:01 +0000 (18:20 +0300)]
selftests: mlxsw: Test port shaper

TBF can be used as a root qdisc, in which case it is supposed to configure
port shaper. Add a test that verifies that this is so by installing a root
TBF with a ETS or PRIO below it, and then expecting individual bands to all
be shaped according to the root TBF configuration.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mlxsw: Test offloadability of root TBF
Petr Machata [Wed, 27 Oct 2021 15:20:00 +0000 (18:20 +0300)]
selftests: mlxsw: Test offloadability of root TBF

TBF can be used as a root qdisc, with the usual ETS/RED/TBF hierarchy below
it. This use should now be offloaded. Add a test that verifies that it is.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomlxsw: spectrum_qdisc: Offload root TBF as port shaper
Petr Machata [Wed, 27 Oct 2021 15:19:59 +0000 (18:19 +0300)]
mlxsw: spectrum_qdisc: Offload root TBF as port shaper

The Spectrum ASIC allows configuration of maximum shaper on all levels of
the scheduling hierarchy: TCs, subgroups, groups and also ports. Currently,
TBF always configures a subgroup. But a user could reasonably express the
intent to configure port shaper by putting TBF to a root position, around
ETS / PRIO. Accept this usage and offload appropriately.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 28 Oct 2021 17:43:58 +0000 (10:43 -0700)]
Merge git://git./linux/kernel/git/netdev/net

include/net/sock.h
  7b50ecfcc6cd ("net: Rename ->stream_memory_read to ->sock_is_readable")
  4c1e34c0dbff ("vsock: Enable y2038 safe timeval for timeout")

drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
  0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table")
  e77bcdd1f639 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.")

Adjacent code addition in both cases, keep both.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 28 Oct 2021 17:17:31 +0000 (10:17 -0700)]
Merge tag 'net-5.15-rc8' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from WiFi (mac80211), and BPF.

  Current release - regressions:

   - skb_expand_head: adjust skb->truesize to fix socket memory
     accounting

   - mptcp: fix corrupt receiver key in MPC + data + checksum

  Previous releases - regressions:

   - multicast: calculate csum of looped-back and forwarded packets

   - cgroup: fix memory leak caused by missing cgroup_bpf_offline

   - cfg80211: fix management registrations locking, prevent list
     corruption

   - cfg80211: correct false positive in bridge/4addr mode check

   - tcp_bpf: fix race in the tcp_bpf_send_verdict resulting in reusing
     previous verdict

  Previous releases - always broken:

   - sctp: enhancements for the verification tag, prevent attackers from
     killing SCTP sessions

   - tipc: fix size validations for the MSG_CRYPTO type

   - mac80211: mesh: fix HE operation element length check, prevent out
     of bound access

   - tls: fix sign of socket errors, prevent positive error codes being
     reported from read()/write()

   - cfg80211: scan: extend RCU protection in
     cfg80211_add_nontrans_list()

   - implement ->sock_is_readable() for UDP and AF_UNIX, fix poll() for
     sockets in a BPF sockmap

   - bpf: fix potential race in tail call compatibility check resulting
     in two operations which would make the map incompatible succeeding

   - bpf: prevent increasing bpf_jit_limit above max

   - bpf: fix error usage of map_fd and fdget() in generic batch update

   - phy: ethtool: lock the phy for consistency of results

   - prevent infinite while loop in skb_tx_hash() when Tx races with
     driver reconfiguring the queue <> traffic class mapping

   - usbnet: fixes for bad HW conjured by syzbot

   - xen: stop tx queues during live migration, prevent UAF

   - net-sysfs: initialize uid and gid before calling
     net_ns_get_ownership

   - mlxsw: prevent Rx stalls under memory pressure"

* tag 'net-5.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  Revert "net: hns3: fix pause config problem after autoneg disabled"
  mptcp: fix corrupt receiver key in MPC + data + checksum
  riscv, bpf: Fix potential NULL dereference
  octeontx2-af: Fix possible null pointer dereference.
  octeontx2-af: Display all enabled PF VF rsrc_alloc entries.
  octeontx2-af: Check whether ipolicers exists
  net: ethernet: microchip: lan743x: Fix skb allocation failure
  net/tls: Fix flipped sign in async_wait.err assignment
  net/tls: Fix flipped sign in tls_err_abort() calls
  net/smc: Correct spelling mistake to TCPF_SYN_RECV
  net/smc: Fix smc_link->llc_testlink_time overflow
  nfp: bpf: relax prog rejection for mtu check through max_pkt_offset
  vmxnet3: do not stop tx queues after netif_device_detach()
  r8169: Add device 10ec:8162 to driver r8169
  ptp: Document the PTP_CLK_MAGIC ioctl number
  usbnet: fix error return code in usbnet_probe()
  net: hns3: adjust string spaces of some parameters of tx bd info in debugfs
  net: hns3: expand buffer len for some debugfs command
  net: hns3: add more string spaces for dumping packets number of queue info in debugfs
  net: hns3: fix data endian problem of some functions of debugfs
  ...

2 years agoMerge tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Thu, 28 Oct 2021 17:04:39 +0000 (10:04 -0700)]
Merge tag 'spi-fix-v5.15-rc7' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A couple of final driver specific fixes for v5.15, one fixing
  potential ID collisions between two instances of the Altera driver and
  one making Microwire full duplex mode actually work on pl022"

* tag 'spi-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spl022: fix Microwire full duplex mode
  spi: altera: Change to dynamic allocation of spi id

2 years agoMerge tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 28 Oct 2021 17:00:58 +0000 (10:00 -0700)]
Merge tag 'regmap-fix-v5.15-rc7' of git://git./linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "This fixes a potential double free when handling an out of memory
  error inserting a node into an rbtree regcache"

* tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Fix possible double-free in regcache_rbtree_exit()

2 years agoMerge tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Thu, 28 Oct 2021 16:55:25 +0000 (09:55 -0700)]
Merge tag 'linux-watchdog-5.15-rc7' of git://linux-watchdog.org/linux-watchdog

Pull watchdog fixes from Wim Van Sebroeck:
 "I overlooked Guenters request to sent this upstream earlier, so it's a
  bit late in the release cycle.

  This contains:

   - Revert "watchdog: iTCO_wdt: Account for rebooting on second
     timeout"

   - sbsa: only use 32-bit accessors

   - sbsa: drop unneeded MODULE_ALIAS

   - ixp4xx_wdt: Fix address space warning

   - Fix OMAP watchdog early handling"

* tag 'linux-watchdog-5.15-rc7' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: Fix OMAP watchdog early handling
  watchdog: ixp4xx_wdt: Fix address space warning
  watchdog: sbsa: drop unneeded MODULE_ALIAS
  watchdog: sbsa: only use 32-bit accessors
  Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"

2 years agoMerge tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rosted...
Linus Torvalds [Thu, 28 Oct 2021 16:50:56 +0000 (09:50 -0700)]
Merge tag 'trace-v5.15-rc6-2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Do not WARN when attaching event probe to non-existent event

  If the user tries to attach an event probe (eprobe) to an event that
  does not exist, it will trigger a warning. There's an error check that
  only expects memory issues otherwise it is considered a bug. But
  changes in the code to move around the locking made it that it can
  error out if the user attempts to attach to an event that does not
  exist, returning an -ENODEV. As this path can be caused by user space
  putting in a bad value, do not trigger a WARN"

* tag 'trace-v5.15-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Do not warn when connecting eprobe to non existing event

2 years agoRevert "net: hns3: fix pause config problem after autoneg disabled"
Guangbin Huang [Thu, 28 Oct 2021 14:06:24 +0000 (22:06 +0800)]
Revert "net: hns3: fix pause config problem after autoneg disabled"

This reverts commit 3bda2e5df476417b6d08967e2d84234a59d57b1c.

According to discussion with Andrew as follow:
https://lore.kernel.org/netdev/09eda9fe-196b-006b-6f01-f54e75715961@huawei.com/

HNS3 driver needs to separate pause autoneg from general autoneg, so revert
this incorrect patch.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/20211028140624.53149-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: fix corrupt receiver key in MPC + data + checksum
Davide Caratti [Wed, 27 Oct 2021 20:38:55 +0000 (13:38 -0700)]
mptcp: fix corrupt receiver key in MPC + data + checksum

using packetdrill it's possible to observe that the receiver key contains
random values when clients transmit MP_CAPABLE with data and checksum (as
specified in RFC8684 Â§3.1). Fix the layout of mptcp_out_options, to avoid
using the skb extension copy when writing the MP_CAPABLE sub-option.

Fixes: d7b269083786 ("mptcp: shrink mptcp_out_options struct")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/233
Reported-by: Poorva Sonparote <psonparo@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20211027203855.264600-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoriscv, bpf: Fix potential NULL dereference
Björn Töpel [Thu, 28 Oct 2021 12:51:15 +0000 (14:51 +0200)]
riscv, bpf: Fix potential NULL dereference

The bpf_jit_binary_free() function requires a non-NULL argument. When
the RISC-V BPF JIT fails to converge in NR_JIT_ITERATIONS steps,
jit_data->header will be NULL, which triggers a NULL
dereference. Avoid this by checking the argument, prior calling the
function.

Fixes: ca6cb5447cec ("riscv, bpf: Factor common RISC-V JIT code")
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20211028125115.514587-1-bjorn@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: virtio: use eth_hw_addr_set()
Jakub Kicinski [Wed, 27 Oct 2021 15:20:12 +0000 (08:20 -0700)]
net: virtio: use eth_hw_addr_set()

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Even though the current code uses dev->addr_len the we can switch
to eth_hw_addr_set() instead of dev_addr_set(). The netdev is
always allocated by alloc_etherdev_mq() and there are at least two
places which assume Ethernet address:
 - the line below calling eth_hw_addr_random()
 - virtnet_set_mac_address() -> eth_commit_mac_addr_change()

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20211027152012.3393077-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agodevlink: Simplify internal devlink params implementation
Leon Romanovsky [Thu, 28 Oct 2021 13:23:21 +0000 (16:23 +0300)]
devlink: Simplify internal devlink params implementation

Reduce extra indirection from devlink_params_*() API. Such change
makes it clear that we can drop devlink->lock from these flows, because
everything is executed when the devlink is not registered yet.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'octeontx2-debugfs-updates'
David S. Miller [Thu, 28 Oct 2021 13:49:29 +0000 (14:49 +0100)]
Merge branch 'octeontx2-debugfs-updates'

Rakesh Babu Saladi says:

====================
RVU Debugfs updates.

Patch 1: Few minor changes such as spelling mistakes, deleting unwanted
characters, etc.
Patch 2: Add debugfs dump for lmtst map table
Patch 3: Add channel and channel mask in debugfs.

Changes made from v2 to v3:
1. In patch 1 moved few lines and submitted those changes as a
different patch to net branch
2. Patch 2 is left unchanged.
3. Patch 3 is left unchanged.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: debugfs: Add channel and channel mask.
Rakesh Babu [Wed, 27 Oct 2021 18:07:45 +0000 (23:37 +0530)]
octeontx2-af: debugfs: Add channel and channel mask.

This patch is to dispaly channel and channel_mask for each RX
interface of NPC MCAM rule.

Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: cn10k: debugfs for dumping LMTST map table
Harman Kalra [Wed, 27 Oct 2021 18:07:44 +0000 (23:37 +0530)]
octeontx2-af: cn10k: debugfs for dumping LMTST map table

CN10k SoCs use atomic stores of up to 128 bytes to submit
packets/instructions into co-processor cores. The enqueueing is performed
using Large Memory Transaction Store (LMTST) operations. They allow for
lockless enqueue operations - i.e., two different CPU cores can submit
instructions to the same queue without needing to lock the queue or
synchronize their accesses.

This patch implements a new debugfs entry for dumping LMTST map
table present on CN10K, as this might be very useful to debug any issue
in case of shared LMTST region among multiple pci functions.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Bhaskara Budiredla <bbudiredla@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: debugfs: Minor changes.
Rakesh Babu Saladi [Wed, 27 Oct 2021 18:07:43 +0000 (23:37 +0530)]
octeontx2-af: debugfs: Minor changes.

Few changes in rvu_debugfs.c file to remove unwanted characters,
indenting the code, added a new comment line etc.

Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'octeontx2-debugfs-fixes'
David S. Miller [Thu, 28 Oct 2021 13:47:37 +0000 (14:47 +0100)]
Merge branch 'octeontx2-debugfs-fixes'

Rakesh Babu Saladi says:

====================
RVU Debugfs fix updates.

The following patch series consists of the patch fixes done over
rvu_debugfs.c and rvu_nix.c files.

Patch 1: Check and return if ipolicers do not exists.
Patch 2: Fix rsrc_alloc to print all enabled PF/VF entries with list of LFs
allocated for each functional block.
Patch 3: Fix possible null pointer dereference.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: Fix possible null pointer dereference.
Rakesh Babu Saladi [Wed, 27 Oct 2021 17:32:34 +0000 (23:02 +0530)]
octeontx2-af: Fix possible null pointer dereference.

This patch fixes possible null pointer dereference in files
"rvu_debugfs.c" and "rvu_nix.c"

Fixes: 8756828a8148 ("octeontx2-af: Add NPA aura and pool contexts to debugfs")
Fixes: 9a946def264d ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries")
Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: Display all enabled PF VF rsrc_alloc entries.
Rakesh Babu [Wed, 27 Oct 2021 17:32:33 +0000 (23:02 +0530)]
octeontx2-af: Display all enabled PF VF rsrc_alloc entries.

Currently, we are using a fixed buffer size of length 2048 to display
rsrc_alloc output. As a result a maximum of 2048 characters of
rsrc_alloc output is displayed, which may lead sometimes to display only
partial output. This patch fixes this dependency on max limit of buffer
size and displays all PF VF entries.

Each column of the debugfs entry "rsrc_alloc" uses a fixed width of 12
characters to print the list of LFs of each block for a PF/VF. If the
length of list of LFs of a block exceeds this fixed width then the list
gets truncated and displays only a part of the list. This patch fixes
this by using the maximum possible length of list of LFs among all
blocks of all PFs and VFs entries as the width size.

Fixes: f7884097141b ("octeontx2-af: Formatting debugfs entry rsrc_alloc.")
Fixes: 23205e6d06d4 ("octeontx2-af: Dump current resource provisioning status")
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: Check whether ipolicers exists
Subbaraya Sundeep [Wed, 27 Oct 2021 17:32:32 +0000 (23:02 +0530)]
octeontx2-af: Check whether ipolicers exists

While displaying ingress policers information in
debugfs check whether ingress policers exist in
the hardware or not because some platforms(CN9XXX)
do not have this feature.

Fixes: e7d8971763f3 ("octeontx2-af: cn10k: Debugfs support for bandwidth")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ethernet: microchip: lan743x: Fix skb allocation failure
Yuiko Oshino [Wed, 27 Oct 2021 18:23:02 +0000 (14:23 -0400)]
net: ethernet: microchip: lan743x: Fix skb allocation failure

The driver allocates skb during ndo_open with GFP_ATOMIC which has high chance of failure when there are multiple instances.
GFP_KERNEL is enough while open and use GFP_ATOMIC only from interrupt context.

Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: microchip_t1: add cable test support for lan87xx phy
Yuiko Oshino [Wed, 27 Oct 2021 18:20:51 +0000 (14:20 -0400)]
net: phy: microchip_t1: add cable test support for lan87xx phy

Add a basic cable test (diagnostic) support for lan87xx phy.
Tested with LAN8770 for connected/open/short wires using ethtool.

Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoptp: fix code indentation issues
Carlos Llamas [Wed, 27 Oct 2021 23:18:02 +0000 (23:18 +0000)]
ptp: fix code indentation issues

This fixes the following checkpatch.pl errors:

ERROR: code indent should use tabs where possible
+^I        if (ptp->pps_source)$

ERROR: code indent should use tabs where possible
+^I                pps_unregister_source(ptp->pps_source);$

ERROR: code indent should use tabs where possible
+^I                kthread_destroy_worker(ptp->kworker);$

Fixes: 4225fea1cb28 ("ptp: Fix possible memory leak in ptp_clock_register()")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tls: Fix flipped sign in async_wait.err assignment
Daniel Jordan [Wed, 27 Oct 2021 21:59:21 +0000 (17:59 -0400)]
net/tls: Fix flipped sign in async_wait.err assignment

sk->sk_err contains a positive number, yet async_wait.err wants the
opposite.  Fix the missed sign flip, which Jakub caught by inspection.

Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tls: Fix flipped sign in tls_err_abort() calls
Daniel Jordan [Wed, 27 Oct 2021 21:59:20 +0000 (17:59 -0400)]
net/tls: Fix flipped sign in tls_err_abort() calls

sk->sk_err appears to expect a positive value, a convention that ktls
doesn't always follow and that leads to memory corruption in other code.
For instance,

    [kworker]
    tls_encrypt_done(..., err=<negative error from crypto request>)
      tls_err_abort(.., err)
        sk->sk_err = err;

    [task]
    splice_from_pipe_feed
      ...
        tls_sw_do_sendpage
          if (sk->sk_err) {
            ret = -sk->sk_err;  // ret is positive

    splice_from_pipe_feed (continued)
      ret = actor(...)  // ret is still positive and interpreted as bytes
                        // written, resulting in underflow of buf->len and
                        // sd->len, leading to huge buf->offset and bogus
                        // addresses computed in later calls to actor()

Fix all tls_err_abort() callers to pass a negative error code
consistently and centralize the error-prone sign flip there, throwing in
a warning to catch future misuse and uninlining the function so it
really does only warn once.

Cc: stable@vger.kernel.org
Fixes: c46234ebb4d1e ("tls: RX path for ktls")
Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: cleanup __sk_stream_memory_free()
Eric Dumazet [Thu, 28 Oct 2021 01:12:05 +0000 (18:12 -0700)]
net: cleanup __sk_stream_memory_free()

We now have INDIRECT_CALL_INET_1() macro, no need to use #ifdef CONFIG_INET

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosky2: Remove redundant assignment and parentheses
luo penghao [Thu, 28 Oct 2021 03:15:51 +0000 (03:15 +0000)]
sky2: Remove redundant assignment and parentheses

The variable err will be reassigned on subsequent branches, and this
assignment does not perform related value operations. This will cause
the double parentheses to be redundant, so the inner parentheses should
be deleted.

clang_analyzer complains as follows:

drivers/net/ethernet/marvell/sky2.c:4988: warning:

Although the value stored to 'err' is used in the enclosing expression,
the value is never actually read from 'err'.

Changes in v2:

modify title category:octeontx2-af to sky2.
delete the inner parentheses.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ipconfig: Release the rtnl_lock while waiting for carrier
Maxime Chevallier [Thu, 28 Oct 2021 13:18:04 +0000 (15:18 +0200)]
net: ipconfig: Release the rtnl_lock while waiting for carrier

While waiting for a carrier to come on one of the netdevices, some
devices will require to take the rtnl lock at some point to fully
initialize all parts of the link.

That's the case for SFP, where the rtnl is taken when a module gets
detected. This prevents mounting an NFS rootfs over an SFP link.

This means that while ipconfig waits for carriers to be detected, no SFP
modules can be detected in the meantime, it's only detected after
ipconfig times out.

This commit releases the rtnl_lock while waiting for the carrier to come
up, and re-takes it to check the for the init device and carrier status.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodevlink: add documentation for octeontx2 driver
Subbaraya Sundeep [Thu, 28 Oct 2021 13:08:15 +0000 (18:38 +0530)]
devlink: add documentation for octeontx2 driver

Add a file to document devlink support for octeontx2
driver. Driver-specific parameters implemented by
AF, PF and VF drivers are documented.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosch_htb: Add extack messages for EOPNOTSUPP errors
Maxim Mikityanskiy [Thu, 28 Oct 2021 12:24:36 +0000 (15:24 +0300)]
sch_htb: Add extack messages for EOPNOTSUPP errors

In order to make the "Operation not supported" message clearer to the
user, add extack messages explaining why exactly adding offloaded HTB
could be not supported in each case.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'SMC-fixes'
David S. Miller [Thu, 28 Oct 2021 12:04:29 +0000 (13:04 +0100)]
Merge branch 'SMC-fixes'

Tony Lu says:

====================
Fixes for SMC

There are some fixes for SMC.

v1->v2:
- fix wrong email address.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/smc: Correct spelling mistake to TCPF_SYN_RECV
Wen Gu [Thu, 28 Oct 2021 07:13:47 +0000 (15:13 +0800)]
net/smc: Correct spelling mistake to TCPF_SYN_RECV

There should use TCPF_SYN_RECV instead of TCP_SYN_RECV.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/smc: Fix smc_link->llc_testlink_time overflow
Tony Lu [Thu, 28 Oct 2021 07:13:45 +0000 (15:13 +0800)]
net/smc: Fix smc_link->llc_testlink_time overflow

The value of llc_testlink_time is set to the value stored in
net->ipv4.sysctl_tcp_keepalive_time when linkgroup init. The value of
sysctl_tcp_keepalive_time is already jiffies, so we don't need to
multiply by HZ, which would cause smc_link->llc_testlink_time overflow,
and test_link send flood.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge tag 'mlx5-net-next-5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Thu, 28 Oct 2021 12:02:44 +0000 (13:02 +0100)]
Merge tag 'mlx5-net-next-5.15-rc7' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Merge mlx5-next into net-next
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonfp: bpf: relax prog rejection for mtu check through max_pkt_offset
Yu Xiao [Thu, 28 Oct 2021 10:00:36 +0000 (12:00 +0200)]
nfp: bpf: relax prog rejection for mtu check through max_pkt_offset

MTU change is refused whenever the value of new MTU is bigger than
the max packet bytes that fits in NFP Cluster Target Memory (CTM).
However, an eBPF program doesn't always need to access the whole
packet data.

The maximum direct packet access (DPA) offset has always been
caculated by verifier and stored in the max_pkt_offset field of prog
aux data.

Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Niklas Soderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'mvpp2-phylink'
David S. Miller [Thu, 28 Oct 2021 11:55:44 +0000 (12:55 +0100)]
Merge branch 'mvpp2-phylink'

Russell King says:

====================
Convert mvpp2 to phylink supported_interfaces

This patch series converts mvpp2 to use phylinks supported_interfaces
bitmap to simplify the validate() implementation. The patches:

1) Add the supported interface modes the supported_interfaces bitmap.
2) Removes the checks for the interface type being supported from
   the validate callback
3) Removes the now unnecessary checks and call to
   phylink_helper_basex_speed() to support switching between
   1000base-X and 2500base-X for SFPs
4) Cleans up the resulting validate() code.

(3) becomes possible because when asking the MAC for its complete
support, we walk all supported interfaces which will include 1000base-X
and 2500base-X only if the comphy is present.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mvpp2: clean up mvpp2_phylink_validate()
Russell King (Oracle) [Wed, 27 Oct 2021 09:49:29 +0000 (10:49 +0100)]
net: mvpp2: clean up mvpp2_phylink_validate()

mvpp2_phylink_validate() no longer needs to check for
PHY_INTERFACE_MODE_NA as phylink will walk the supported interface
types to discover the link mode capabilities. Remove these checks.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mvpp2: drop use of phylink_helper_basex_speed()
Russell King (Oracle) [Wed, 27 Oct 2021 09:49:24 +0000 (10:49 +0100)]
net: mvpp2: drop use of phylink_helper_basex_speed()

Now that we have a better method to select SFP interface modes, we
no longer need to use phylink_helper_basex_speed() in a driver's
validation function, and we can also get rid of our hack to indicate
both 1000base-X and 2500base-X if the comphy is present to make that
work. Remove this hack and use of phylink_helper_basex_speed().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mvpp2: remove interface checks in mvpp2_phylink_validate()
Russell King (Oracle) [Wed, 27 Oct 2021 09:49:19 +0000 (10:49 +0100)]
net: mvpp2: remove interface checks in mvpp2_phylink_validate()

As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode in the
validation function. Remove this to simplify it.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mvpp2: populate supported_interfaces member
Russell King [Wed, 27 Oct 2021 09:49:14 +0000 (10:49 +0100)]
net: mvpp2: populate supported_interfaces member

Populate the phy interface mode bitmap for the Marvell mvpp2 driver
with interfaces modes supported by the MAC.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv6: enable net.ipv6.route.max_size sysctl in network namespace
Alexander Kuznetsov [Wed, 27 Oct 2021 08:00:08 +0000 (11:00 +0300)]
ipv6: enable net.ipv6.route.max_size sysctl in network namespace

We want to increase route cache size in network namespace
created with user namespace. Currently ipv6 route settings
are disabled for non-initial network namespaces.
We can allow this sysctl and it will be safe since
commit <6126891c6d4f> because route cache account to kmem,
that is why users from user namespace can not DOS system.

Signed-off-by: Alexander Kuznetsov <wwfq@yandex-team.ru>
Acked-by: Dmitry Yakunin <zeil@yandex-team.ru>
Acked-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agovmxnet3: do not stop tx queues after netif_device_detach()
Dongli Zhang [Tue, 26 Oct 2021 21:50:31 +0000 (14:50 -0700)]
vmxnet3: do not stop tx queues after netif_device_detach()

The netif_device_detach() conditionally stops all tx queues if the queues
are running. There is no need to call netif_tx_stop_all_queues() again.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agompt fusion: use dev_addr_set()
Jakub Kicinski [Tue, 26 Oct 2021 17:55:00 +0000 (10:55 -0700)]
mpt fusion: use dev_addr_set()

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agofirewire: don't write directly to netdev->dev_addr
Jakub Kicinski [Tue, 26 Oct 2021 17:53:52 +0000 (10:53 -0700)]
firewire: don't write directly to netdev->dev_addr

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Prepare fwnet_hwaddr on the stack and use dev_addr_set() to copy
it to netdev->dev_addr. We no longer need to worry about alignment.
union fwnet_hwaddr does not have any padding and we set all fields
so we don't need to zero it upfront.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomedia: use eth_hw_addr_set()
Jakub Kicinski [Tue, 26 Oct 2021 17:52:50 +0000 (10:52 -0700)]
media: use eth_hw_addr_set()

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Convert media from memcpy(... 6) and memcpy(... addr_len) to
eth_hw_addr_set():

  @@
  expression dev, np;
  @@
  - memcpy(dev->dev_addr, np, 6)
  + eth_hw_addr_set(dev, np)
  @@
  - memcpy(dev->dev_addr, np, dev->addr_len)
  + eth_hw_addr_set(dev, np)

Make sure we don't cast off const qualifier from dev->dev_addr.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'tcp-tx-side-cleanups'
David S. Miller [Thu, 28 Oct 2021 11:44:40 +0000 (12:44 +0100)]
Merge branch 'tcp-tx-side-cleanups'

Eric Dumazet says:

====================
tcp: tx side cleanups

We no longer need to set skb->reserved_tailroom because
TCP sendmsg() do not put payload in skb->head anymore.

Also do some cleanups around skb->ip_summed/csum,
and CP_SKB_CB(skb)->sacked for fresh skbs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: do not clear TCP_SKB_CB(skb)->sacked if already zero
Eric Dumazet [Wed, 27 Oct 2021 20:19:23 +0000 (13:19 -0700)]
tcp: do not clear TCP_SKB_CB(skb)->sacked if already zero

Freshly allocated skbs have zero in skb->cb[] already.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: do not clear skb->csum if already zero
Eric Dumazet [Wed, 27 Oct 2021 20:19:22 +0000 (13:19 -0700)]
tcp: do not clear skb->csum if already zero

Freshly allocated skbs have their csum field cleared already.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: factorize ip_summed setting
Eric Dumazet [Wed, 27 Oct 2021 20:19:21 +0000 (13:19 -0700)]
tcp: factorize ip_summed setting

Setting skb->ip_summed to CHECKSUM_PARTIAL can be centralized
in tcp_stream_alloc_skb() and __mptcp_do_alloc_tx_skb()
instead of being done multiple times.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: no longer set skb->reserved_tailroom
Eric Dumazet [Wed, 27 Oct 2021 20:19:20 +0000 (13:19 -0700)]
tcp: no longer set skb->reserved_tailroom

TCP/MPTCP sendmsg() no longer puts payload in skb->head,
we can remove not needed code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: remove dead code from tcp_collapse_retrans()
Eric Dumazet [Wed, 27 Oct 2021 20:19:19 +0000 (13:19 -0700)]
tcp: remove dead code from tcp_collapse_retrans()

TCP sendmsg() no longer puts payload in skb->head,
remove some dead code from tcp_collapse_retrans().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: cleanup tcp_remove_empty_skb() use
Eric Dumazet [Wed, 27 Oct 2021 20:19:18 +0000 (13:19 -0700)]
tcp: cleanup tcp_remove_empty_skb() use

All tcp_remove_empty_skb() callers now use tcp_write_queue_tail()
for the skb argument, we can therefore factorize code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: remove dead code from tcp_sendmsg_locked()
Eric Dumazet [Wed, 27 Oct 2021 20:19:17 +0000 (13:19 -0700)]
tcp: remove dead code from tcp_sendmsg_locked()

TCP sendmsg() no longer puts payload in skb head, we can remove
dead code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
Saeed Mahameed [Wed, 27 Oct 2021 19:45:37 +0000 (12:45 -0700)]
Merge branch 'mlx5-next' of git://git./linux/kernel/git/mellanox/linux into net-next

Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agotracing: Do not warn when connecting eprobe to non existing event
Steven Rostedt (VMware) [Wed, 27 Oct 2021 16:08:54 +0000 (12:08 -0400)]
tracing: Do not warn when connecting eprobe to non existing event

When the syscall trace points are not configured in, the kselftests for
ftrace will try to attach an event probe (eprobe) to one of the system
call trace points. This triggered a WARNING, because the failure only
expects to see memory issues. But this is not the only failure. The user
may attempt to attach to a non existent event, and the kernel must not
warn about it.

Link: https://lkml.kernel.org/r/20211027120854.0680aa0f@gandalf.local.home
Fixes: 7491e2c442781 ("tracing: Add a probe that attaches to trace events")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2 years agonet: phy: Fix unsigned comparison with less than zero
Jiapeng Chong [Wed, 27 Oct 2021 08:59:51 +0000 (16:59 +0800)]
net: phy: Fix unsigned comparison with less than zero

Fix the following coccicheck warning:

./drivers/net/phy/at803x.c:493:5-10: WARNING: Unsigned expression
compared with zero: value < 0.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Fixes: 7beecaf7d507 ("net: phy: at803x: improve the WOL feature")
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/1635325191-101815-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mptcp-rework-fwd-memory-allocation-and-one-cleanup'
Jakub Kicinski [Thu, 28 Oct 2021 01:20:31 +0000 (18:20 -0700)]
Merge branch 'mptcp-rework-fwd-memory-allocation-and-one-cleanup'

Mat Martineau says:

====================
mptcp: Rework fwd memory allocation and one cleanup

These patches from the MPTCP tree rework forward memory allocation for
MPTCP (with some supporting changes in the net core), and also clean up
an unused function parameter.

Patch 1 updates TCP code but does not change any behavior, and creates
some macros for reclaim thresholds that will be reused in the MPTCP
code.

Patch 2 adds sk_forward_alloc_get() to the networking core to support
MPTCP's forward allocation with the diag interface.

Patch 3 reworks forward memory for MPTCP.

Patch 4 removes an unused arg and has no functional changes.
====================

Link: https://lore.kernel.org/r/20211026232916.179450-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: drop unused sk in mptcp_push_release
Geliang Tang [Tue, 26 Oct 2021 23:29:16 +0000 (16:29 -0700)]
mptcp: drop unused sk in mptcp_push_release

Since mptcp_set_timeout() had removed from mptcp_push_release() in
commit 33d41c9cd74c5 ("mptcp: more accurate timeout"), the argument
sk in mptcp_push_release() became useless. Let's drop it.

Fixes: 33d41c9cd74c5 ("mptcp: more accurate timeout")
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: allocate fwd memory separately on the rx and tx path
Paolo Abeni [Tue, 26 Oct 2021 23:29:15 +0000 (16:29 -0700)]
mptcp: allocate fwd memory separately on the rx and tx path

All the mptcp receive path is protected by the msk socket
spinlock. As consequences, the tx path has to play a few tricks to
allocate the forward memory without acquiring the spinlock multiple
times, making the overall TX path quite complex.

This patch tries to clean-up a bit the tx path, using completely
separated fwd memory allocation, for the rx and the tx path.

The forward memory allocated in the rx path is now accounted in
msk->rmem_fwd_alloc and is (still) protected by the msk socket spinlock.

To cope with the above we provide a few MPTCP-specific variants for
the helpers to charge, uncharge, reclaim and free the forward memory
in the receive path.

msk->sk_forward_alloc now accounts only the forward memory for the tx
path, we can use the plain core sock helper to manipulate it and drop
quite a bit of complexity.

On memory pressure, both rx and tx fwd memories are reclaimed.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: introduce sk_forward_alloc_get()
Paolo Abeni [Tue, 26 Oct 2021 23:29:14 +0000 (16:29 -0700)]
net: introduce sk_forward_alloc_get()

A later patch will change the MPTCP memory accounting schema
in such a way that MPTCP sockets will encode the total amount of
forward allocated memory in two separate fields (one for tx and
one for rx).

MPTCP sockets will use their own helper to provide the accurate
amount of fwd allocated memory.

To allow the above, this patch adds a new, optional, sk method to
fetch the fwd memory, wrap the call in a new helper and use it
where it is appropriate.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: define macros for a couple reclaim thresholds
Paolo Abeni [Tue, 26 Oct 2021 23:29:13 +0000 (16:29 -0700)]
tcp: define macros for a couple reclaim thresholds

A following patch is going to implement a similar reclaim schema
for the MPTCP protocol, with different locking.

Let's define a couple of macros for the used thresholds, so
that the latter code will be more easily maintainable.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoinet: remove races in inet{6}_getname()
Eric Dumazet [Tue, 26 Oct 2021 21:30:14 +0000 (14:30 -0700)]
inet: remove races in inet{6}_getname()

syzbot reported data-races in inet_getname() multiple times,
it is time we fix this instead of pretending applications
should not trigger them.

getsockname() and getpeername() are not really considered fast path.

v2: added the missing BPF_CGROUP_RUN_SA_PROG() declaration
    needed when CONFIG_CGROUP_BPF=n, as reported by
    kernel test robot <lkp@intel.com>

syzbot typical report:

BUG: KCSAN: data-race in __inet_hash_connect / inet_getname

write to 0xffff888136d66cf8 of 2 bytes by task 14374 on cpu 1:
 __inet_hash_connect+0x7ec/0x950 net/ipv4/inet_hashtables.c:831
 inet_hash_connect+0x85/0x90 net/ipv4/inet_hashtables.c:853
 tcp_v4_connect+0x782/0xbb0 net/ipv4/tcp_ipv4.c:275
 __inet_stream_connect+0x156/0x6e0 net/ipv4/af_inet.c:664
 inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:728
 __sys_connect_file net/socket.c:1896 [inline]
 __sys_connect+0x254/0x290 net/socket.c:1913
 __do_sys_connect net/socket.c:1923 [inline]
 __se_sys_connect net/socket.c:1920 [inline]
 __x64_sys_connect+0x3d/0x50 net/socket.c:1920
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff888136d66cf8 of 2 bytes by task 14408 on cpu 0:
 inet_getname+0x11f/0x170 net/ipv4/af_inet.c:790
 __sys_getsockname+0x11d/0x1b0 net/socket.c:1946
 __do_sys_getsockname net/socket.c:1961 [inline]
 __se_sys_getsockname net/socket.c:1958 [inline]
 __x64_sys_getsockname+0x3e/0x50 net/socket.c:1958
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x44/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0x0000 -> 0xdee0

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 14408 Comm: syz-executor.3 Not tainted 5.15.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20211026213014.3026708-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoxdp: Remove redundant warning
Yajun Deng [Wed, 27 Oct 2021 01:38:56 +0000 (09:38 +0800)]
xdp: Remove redundant warning

There is a warning in xdp_rxq_info_unreg_mem_model() when reg_state isn't
equal to REG_STATE_REGISTERED, so the warning in xdp_rxq_info_unreg() is
redundant.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/r/20211027013856.1866-1-yajun.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: thunderbolt: use eth_hw_addr_set()
Jakub Kicinski [Tue, 26 Oct 2021 17:55:47 +0000 (10:55 -0700)]
net: thunderbolt: use eth_hw_addr_set()

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Link: https://lore.kernel.org/r/20211026175547.3198242-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agostaging: use of_get_ethdev_address()
Jakub Kicinski [Tue, 26 Oct 2021 17:50:38 +0000 (10:50 -0700)]
staging: use of_get_ethdev_address()

Use the new of_get_ethdev_address() helper for the cases
where dev->dev_addr is passed in directly as the destination.

  @@
  expression dev, np;
  @@
  - of_get_mac_address(np, dev->dev_addr)
  + of_get_ethdev_address(np, dev)

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20211026175038.3197397-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: macb: Fix mdio child node detection
Guenter Roeck [Tue, 26 Oct 2021 17:39:50 +0000 (10:39 -0700)]
net: macb: Fix mdio child node detection

Commit 4d98bb0d7ec2 ("net: macb: Use mdio child node for MDIO bus if it
exists") added code to detect if a 'mdio' child node exists to the macb
driver. Ths added code does, however, not actually check if the child node
exists, but if the parent node exists. This results in errors such as

macb 10090000.ethernet eth0: Could not attach PHY (-19)

if there is no 'mdio' child node. Fix the code to actually check for
the child node.

Fixes: 4d98bb0d7ec2 ("net: macb: Use mdio child node for MDIO bus if it exists")
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Link: https://lore.kernel.org/r/20211026173950.353636-1-linux@roeck-us.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: sch: simplify condtion for selecting mini_Qdisc_pair buffer
Seth Forshee [Tue, 26 Oct 2021 18:37:21 +0000 (13:37 -0500)]
net: sch: simplify condtion for selecting mini_Qdisc_pair buffer

The only valid values for a miniq pointer are NULL or a pointer to
miniq1 or miniq2, so testing for miniq_old != &miniq1 is functionally
equivalent to testing that it is NULL or equal to &miniq2.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Link: https://lore.kernel.org/r/20211026183721.137930-1-seth@forshee.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: sch: eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
Seth Forshee [Tue, 26 Oct 2021 13:06:59 +0000 (08:06 -0500)]
net: sch: eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

Currently rcu_barrier() is used to ensure that no readers of the
inactive mini_Qdisc buffer remain before it is reused. This waits for
any pending RCU callbacks to complete, when all that is actually
required is to wait for one RCU grace period to elapse after the buffer
was made inactive. This means that using rcu_barrier() may result in
unnecessary waits.

To improve this, store the current RCU state when a buffer is made
inactive and use poll_state_synchronize_rcu() to check whether a full
grace period has elapsed before reusing it. If a full grace period has
not elapsed, wait for a grace period to elapse, and in the non-RT case
use synchronize_rcu_expedited() to hasten it.

Since this approach eliminates the RCU callback it is no longer
necessary to synchronize_rcu() in the tp_head==NULL case. However, the
RCU state should still be saved for the previously active buffer.

Before this change I would typically see mini_qdisc_pair_swap() take
tens of milliseconds to complete. After this change it typcially
finishes in less than 1 ms, and often it takes just a few microseconds.

Thanks to Paul for walking me through the options for improving this.

Cc: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Link: https://lore.kernel.org/r/20211026130700.121189-1-seth@forshee.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agor8169: Add device 10ec:8162 to driver r8169
Janghyub Seo [Tue, 26 Oct 2021 07:12:42 +0000 (07:12 +0000)]
r8169: Add device 10ec:8162 to driver r8169

This patch makes the driver r8169 pick up device Realtek Semiconductor Co.
, Ltd. Device [10ec:8162].

Signed-off-by: Janghyub Seo <jhyub06@gmail.com>
Suggested-by: Rushab Shah <rushabshah32@gmail.com>
Link: https://lore.kernel.org/r/1635231849296.1489250046.441294000@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoptp: Document the PTP_CLK_MAGIC ioctl number
Randy Dunlap [Sun, 24 Oct 2021 16:38:31 +0000 (09:38 -0700)]
ptp: Document the PTP_CLK_MAGIC ioctl number

Add PTP_CLK_MAGIC to the userspace-api/ioctl/ioctl-number.rst
documentation file.

Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20211024163831.10200-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Wed, 27 Oct 2021 20:15:05 +0000 (13:15 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "A couple of fixes that seem important enough to pick at the last
  moment"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio-ring: fix DMA metadata flags
  vduse: Fix race condition between resetting and irq injecting
  vduse: Disallow injecting interrupt before DRIVER_OK is set

2 years agovirtio-ring: fix DMA metadata flags
Vincent Whitchurch [Tue, 26 Oct 2021 13:31:00 +0000 (15:31 +0200)]
virtio-ring: fix DMA metadata flags

The flags are currently overwritten, leading to the wrong direction
being passed to the DMA unmap functions.

Fixes: 72b5e8958738aaa4 ("virtio-ring: store DMA metadata in desc_extra for split virtqueue")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20211026133100.17541-1-vincent.whitchurch@axis.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2 years agonet: sched: gred: dynamically allocate tc_gred_qopt_offload
Arnd Bergmann [Tue, 26 Oct 2021 10:07:11 +0000 (12:07 +0200)]
net: sched: gred: dynamically allocate tc_gred_qopt_offload

The tc_gred_qopt_offload structure has grown too big to be on the
stack for 32-bit architectures after recent changes.

net/sched/sch_gred.c:903:13: error: stack frame size (1180) exceeds limit (1024) in 'gred_destroy' [-Werror,-Wframe-larger-than]
net/sched/sch_gred.c:310:13: error: stack frame size (1212) exceeds limit (1024) in 'gred_offload' [-Werror,-Wframe-larger-than]

Use dynamic allocation per qdisc to avoid this.

Fixes: 50dc9a8572aa ("net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types")
Fixes: 67c9e6270f30 ("net: sched: Protect Qdisc::bstats with u64_stats")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20211026100711.nalhttf6mbe6sudx@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agousbnet: fix error return code in usbnet_probe()
Wang Hai [Tue, 26 Oct 2021 12:40:15 +0000 (20:40 +0800)]
usbnet: fix error return code in usbnet_probe()

Return error code if usb_maxpacket() returns 0 in usbnet_probe()

Fixes: 397430b50a36 ("usbnet: sanity check for maxpacket")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211026124015.3025136-1-wanghai38@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'two-reverts-to-calm-down-devlink-discussion'
Jakub Kicinski [Wed, 27 Oct 2021 18:58:11 +0000 (11:58 -0700)]
Merge branch 'two-reverts-to-calm-down-devlink-discussion'

Leon Romanovsky says:

====================
Two reverts to calm down devlink discussion

Two reverts as was discussed in [1], fast, easy and wrong in long run
solution to syzkaller bug [2].

[1] https://lore.kernel.org/all/20211026120234.3408fbcc@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com
[2] https://lore.kernel.org/netdev/000000000000af277405cf0a7ef0@google.com/
====================

Link: https://lore.kernel.org/r/cover.1635276828.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "devlink: Remove not-executed trap policer notifications"
Leon Romanovsky [Tue, 26 Oct 2021 19:40:42 +0000 (22:40 +0300)]
Revert "devlink: Remove not-executed trap policer notifications"

This reverts commit 22849b5ea5952d853547cc5e0651f34a246b2a4f as it
revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregisters
devlink objects during another devlink user triggered command.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoRevert "devlink: Remove not-executed trap group notifications"
Leon Romanovsky [Tue, 26 Oct 2021 19:40:41 +0000 (22:40 +0300)]
Revert "devlink: Remove not-executed trap group notifications"

This reverts commit 8bbeed4858239ac956a78e5cbaf778bd6f3baef8 as it
revealed that mlxsw and netdevsim (copy/paste from mlxsw) reregisters
devlink objects during another devlink user triggered command.

Fixes: 22849b5ea595 ("devlink: Remove not-executed trap policer notifications")
Reported-by: syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'trace-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Wed, 27 Oct 2021 17:41:59 +0000 (10:41 -0700)]
Merge tag 'trace-v5.15-rc6' of git://git./linux/kernel/git/rostedt/linux-trace

Pull nds32 tracing fix from Steven Rostedt:
 "Fix nds32le build when DYNAMIC_FTRACE is disabled

  A randconfig found that nds32le architecture fails to build due to a
  prototype mismatch between a ftrace function pointer and the function
  it was to be assigned to. That function pointer prototype missed being
  updated when all the ftrace callbacks were updated"

* tag 'trace-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace/nds32: Update the proto for ftrace_trace_function to match ftrace_stub

2 years agoMerge tag 'nios2_fixes_for_v5.15_part3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 27 Oct 2021 17:19:43 +0000 (10:19 -0700)]
Merge tag 'nios2_fixes_for_v5.15_part3' of git://git./linux/kernel/git/dinguyen/linux

Pull nios2 fix from Dinh Nguyen:
 "Fix a build error for allmodconfig"

* tag 'nios2_fixes_for_v5.15_part3' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  nios2: Make NIOS2_DTB_SOURCE_BOOL depend on !COMPILE_TEST

2 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Wed, 27 Oct 2021 17:01:17 +0000 (10:01 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Nothing very exciting here, it has been a quiet cycle overall. Usual
  collection of small bug fixes:

   - irdma issues with CQ entries, VLAN completions and a mutex deadlock

   - Incorrect DCT packets in mlx5

   - Userspace triggered overflows in qib

   - Locking error in hfi

   - Typo in errno value in qib/hfi1

   - Double free in qedr

   - Leak of random kernel memory to userspace with a netlink callback"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
  RDMA/irdma: Do not hold qos mutex twice on QP resume
  RDMA/irdma: Set VLAN in UD work completion correctly
  RDMA/mlx5: Initialize the ODP xarray when creating an ODP MR
  rdma/qedr: Fix crash due to redundant release of device's qp memory
  RDMA/rdmavt: Fix error code in rvt_create_qp()
  IB/hfi1: Fix abba locking issue with sc_disable()
  IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
  RDMA/mlx5: Set user priority for DCT
  RDMA/irdma: Process extended CQ entries correctly