OSDN Git Service

tomoyo/tomoyo-test1.git
11 months agomlxsw: spectrum_router: FIB: Use tracker helpers to hold & put netdevices
Petr Machata [Thu, 27 Jul 2023 15:59:22 +0000 (17:59 +0200)]
mlxsw: spectrum_router: FIB: Use tracker helpers to hold & put netdevices

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with FIB events.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/5221a92e751c40447c55959f622267ccc999ed04.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agomlxsw: spectrum_switchdev: Use tracker helpers to hold & put netdevices
Petr Machata [Thu, 27 Jul 2023 15:59:21 +0000 (17:59 +0200)]
mlxsw: spectrum_switchdev: Use tracker helpers to hold & put netdevices

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
switchdev module.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/774c3d7b5b0231f1435df2ec9dd660192e382756.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agomlxsw: spectrum_nve: Do not take reference when looking up netdevice
Petr Machata [Thu, 27 Jul 2023 15:59:20 +0000 (17:59 +0200)]
mlxsw: spectrum_nve: Do not take reference when looking up netdevice

mlxsw_sp_nve_fid_disable() is always called under RTNL. It is therefore
safe to call __dev_get_by_index() to get the netdevice pointer without
bumping the reference count, because we can be sure the netdevice is not
going away. That then obviates the need to put the netdevice later in the
function.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/341d1046f89d8d839d9d00e4a3d58cdc351e9397.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agomlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put()
Petr Machata [Thu, 27 Jul 2023 15:59:19 +0000 (17:59 +0200)]
mlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put()

As of commit 151b89f6025a ("mlxsw: spectrum_router: Reuse work neighbor
initialization in work scheduler"), the functions
mlxsw_sp_port_lower_dev_hold() and mlxsw_sp_port_dev_put() have no users.
Drop them.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/d0adcd7cb4ea19416294a0f861100edba84c9f36.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: change accept_ra_min_rtr_lft to affect all RA lifetimes
Patrick Rohr [Wed, 26 Jul 2023 23:07:01 +0000 (16:07 -0700)]
net: change accept_ra_min_rtr_lft to affect all RA lifetimes

accept_ra_min_rtr_lft only considered the lifetime of the default route
and discarded entire RAs accordingly.

This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and
applies the value to individual RA sections; in particular, router
lifetime, PIO preferred lifetime, and RIO lifetime. If any of those
lifetimes are lower than the configured value, the specific RA section
is ignored.

In order for the sysctl to be useful to Android, it should really apply
to all lifetimes in the RA, since that is what determines the minimum
frequency at which RAs must be processed by the kernel. Android uses
hardware offloads to drop RAs for a fraction of the minimum of all
lifetimes present in the RA (some networks have very frequent RAs (5s)
with high lifetimes (2h)). Despite this, we have encountered networks
that set the router lifetime to 30s which results in very frequent CPU
wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the
WiFi firmware) entirely on such networks, it seems better to ignore the
misconfigured routers while still processing RAs from other IPv6 routers
on the same network (i.e. to support IoT applications).

The previous implementation dropped the entire RA based on router
lifetime. This turned out to be hard to expand to the other lifetimes
present in the RA in a consistent manner; dropping the entire RA based
on RIO/PIO lifetimes would essentially require parsing the whole thing
twice.

Fixes: 1671bcfd76fd ("net: add sysctl accept_ra_min_rtr_lft")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Patrick Rohr <prohr@google.com>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'net-store-netdevs-in-an-xarray'
Jakub Kicinski [Fri, 28 Jul 2023 18:36:00 +0000 (11:36 -0700)]
Merge branch 'net-store-netdevs-in-an-xarray'

Jakub Kicinski says:

====================
net: store netdevs in an xarray

One of more annoying developer experience gaps we have in netlink
is iterating over netdevs. It's painful. Add an xarray to make
it trivial.

v1: https://lore.kernel.org/all/20230722014237.4078962-1-kuba@kernel.org/
====================

Link: https://lore.kernel.org/r/20230726185530.2247698-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: convert some netlink netdev iterators to depend on the xarray
Jakub Kicinski [Wed, 26 Jul 2023 18:55:30 +0000 (11:55 -0700)]
net: convert some netlink netdev iterators to depend on the xarray

Reap the benefits of easier iteration thanks to the xarray.
Convert just the genetlink ones, those are easier to test.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: store netdevs in an xarray
Jakub Kicinski [Wed, 26 Jul 2023 18:55:29 +0000 (11:55 -0700)]
net: store netdevs in an xarray

Iterating over the netdev hash table for netlink dumps is hard.
Dumps are done in "chunks" so we need to save the position
after each chunk, so we know where to restart from. Because
netdevs are stored in a hash table we remember which bucket
we were in and how many devices we dumped.

Since we don't hold any locks across the "chunks" - devices may
come and go while we're dumping. If that happens we may miss
a device (if device is deleted from the bucket we were in).
We indicate to user space that this may have happened by setting
NLM_F_DUMP_INTR. User space is supposed to dump again (I think)
if it sees that. Somehow I doubt most user space gets this right..

To illustrate let's look at an example:

               System state:
  start:       # [A, B, C]
  del:  B      # [A, C]

with the hash table we may dump [A, B], missing C completely even
tho it existed both before and after the "del B".

Add an xarray and use it to allocate ifindexes. This way we
can iterate ifindexes in order, without the worry that we'll
skip one. We may still generate a dump of a state which "never
existed", for example for a set of values and sequence of ops:

               System state:
  start:       # [A, B]
  add:  C      # [A, C, B]
  del:  B      # [A, C]

we may generate a dump of [A], if C got an index between A and B.
System has never been in such state. But I'm 90% sure that's perfectly
fine, important part is that we can't _miss_ devices which exist before
and after. User space which wants to mirror kernel's state subscribes
to notifications and does periodic dumps so it will know that C exists
from the notification about its creation or from the next dump
(next dump is _guaranteed_ to include C, if it doesn't get removed).

To avoid any perf regressions keep the hash table for now. Most
net namespaces have very few devices and microbenchmarking 1M lookups
on Skylake I get the following results (not counting loopback
to number of devs):

 #devs | hash |  xa  | delta
    2  | 18.3 | 20.1 | + 9.8%
   16  | 18.3 | 20.1 | + 9.5%
   64  | 18.3 | 26.3 | +43.8%
  128  | 20.4 | 26.3 | +28.6%
  256  | 20.0 | 26.4 | +32.1%
 1024  | 26.6 | 26.7 | + 0.2%
 8192  |541.3 | 33.5 | -93.8%

No surprises since the hash table has 256 entries.
The microbenchmark scans indexes in order, if the pattern is more
random xa starts to win at 512 devices already. But that's a lot
of devices, in practice.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'ynl-couple-of-unrelated-fixes'
Jakub Kicinski [Fri, 28 Jul 2023 16:33:14 +0000 (09:33 -0700)]
Merge branch 'ynl-couple-of-unrelated-fixes'

Stanislav Fomichev says:

====================
ynl: couple of unrelated fixes

- spelling of xdp-features
- s/xdp_zc_max_segs/xdp-zc-max-segs/
- expose xdp-zc-max-segs
- add /* private: */
- regenerate headers
- print xdp_zc_max_segs from sample
====================

Link: https://lore.kernel.org/r/20230727163001.3952878-1-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoynl: print xdp-zc-max-segs in the sample
Stanislav Fomichev [Thu, 27 Jul 2023 16:30:01 +0000 (09:30 -0700)]
ynl: print xdp-zc-max-segs in the sample

Technically we don't have to keep extending the sample, but it
feels useful to run these tools locally to confirm everything
is working.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-5-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoynl: regenerate all headers
Stanislav Fomichev [Thu, 27 Jul 2023 16:30:00 +0000 (09:30 -0700)]
ynl: regenerate all headers

Also add support to pass topdir to ynl-regen.sh (Jakub) and call
it from the makefile to update the UAPI headers.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Co-developed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-4-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoynl: mark max/mask as private for kdoc
Stanislav Fomichev [Thu, 27 Jul 2023 16:29:59 +0000 (09:29 -0700)]
ynl: mark max/mask as private for kdoc

Simon mentioned in another thread that it makes kdoc happy
and Jakub confirms that commit e27cb89a22ad ("scripts: kernel-doc: support
private / public marking for enums") actually added the needed
support.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-3-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoynl: expose xdp-zc-max-segs
Stanislav Fomichev [Thu, 27 Jul 2023 16:29:58 +0000 (09:29 -0700)]
ynl: expose xdp-zc-max-segs

Also rename it to dashes, to match the rest. And fix unrelated
spelling error while we're at it.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-2-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nex
David S. Miller [Fri, 28 Jul 2023 10:03:57 +0000 (11:03 +0100)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/nex
t-queue

Tony Nguyen says:

====================
ice: Implement support for SRIOV + LAG

Dave Ertman says:

Implement support for SRIOV VF's on interfaces that are in an
aggregate interface.

The first interface added into the aggregate will be flagged as
the primary interface, and this primary interface will be
responsible for managing the VF's resources.  VF's created on the
primary are the only VFs that will be supported on the aggregate.
Only Active-Backup mode will be supported and only aggregates whose
primary interface is in switchdev mode will be supported.

The ice-lag DDP must be loaded to support this feature.

Additional restrictions on what interfaces can be added to the aggregate
and still support SRIOV VFs are:
- interfaces have to all be on the same physical NIC
- all interfaces have to have the same QoS settings
- interfaces have to have the FW LLDP agent disabled
- only the primary interface is to be put into switchdev mode
- no more than two interfaces in the aggregate
---
v2:
- Move NULL check for q_ctx in ice_lag_qbuf_recfg() earlier (patch 6)

v1: https://lore.kernel.org/netdev/20230726182141.3797928-1-anthony.l.nguyen@intel.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoIPv6: add extack info for IPv6 address add/delete
Hangbin Liu [Wed, 26 Jul 2023 02:39:05 +0000 (10:39 +0800)]
IPv6: add extack info for IPv6 address add/delete

Add extack info for IPv6 address add/delete, which would be useful for
users to understand the problem without having to read kernel code.

Suggested-by: Beniamino Galvani <bgalvani@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge branch 'selftest-ptp'
David S. Miller [Fri, 28 Jul 2023 09:59:40 +0000 (10:59 +0100)]
Merge branch 'selftest-ptp'

Alex Maftei says:

====================
selftests/ptp: Add support for new timestamp IOCTLs

PTP_SYS_OFFSET_EXTENDED was added in November 2018 in
361800876f80 (" ptp: add PTP_SYS_OFFSET_EXTENDED ioctl")
and PTP_SYS_OFFSET_PRECISE was added in February 2016 in
719f1aa4a671 ("ptp: Add PTP_SYS_OFFSET_PRECISE for driver crosstimestamping")

The PTP selftest code is lacking support for these two IOCTLS.
This short series of patches adds support for them.

Changes in v2:
- Fixed rebase issues (v1 somehow ended up with patch 1 being from the
  first manual split of my changes and patch 2 being from rebase 2 out
  of 3)
- Rebased on top of net-next
====================

Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests/ptp: Add -X option for testing PTP_SYS_OFFSET_PRECISE
Alex Maftei [Tue, 25 Jul 2023 21:53:34 +0000 (22:53 +0100)]
selftests/ptp: Add -X option for testing PTP_SYS_OFFSET_PRECISE

The -X option was chosen because X looks like a cross, and the underlying
callback is 'get cross timestamp'.

Signed-off-by: Alex Maftei <alex.maftei@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests/ptp: Add -x option for testing PTP_SYS_OFFSET_EXTENDED
Alex Maftei [Tue, 25 Jul 2023 21:53:33 +0000 (22:53 +0100)]
selftests/ptp: Add -x option for testing PTP_SYS_OFFSET_EXTENDED

The -x option (where 'x' stands for eXtended) takes an argument which
represents the number of samples to request from the PTP device.
The help message will display the maximum number of samples allowed.
Providing an invalid argument will also display the maximum number of
samples allowed.

Signed-off-by: Alex Maftei <alex.maftei@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge tag 'linux-can-next-for-6.6-20230728' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Fri, 28 Jul 2023 08:58:53 +0000 (09:58 +0100)]
Merge tag 'linux-can-next-for-6.6-20230728' of git://git./linux/kernel/git/mkl/linux-can-next

linux-can-next-for-6.6-20230728

Marc Kleine-Budde says:

====================
Hello netdev-team,

this is a pull request of 21 patches for net-next/master.

The 1st patch is by Gerhard Uttenthaler, which adds Gerhard as the
maintainer ems_pci driver.

Peter Seiderer's patch removes a unused function from the peak_usb
driver.

The next 4 patches are by John Watts and add support for the sun4i_can
driver on the Allwinner D1.

Rob Herring's patch corrects the DT includes in various CAN drivers.

Followed by 14 patches from me concerning the gs_usb driver. The first
11 are various cleanups consisting of coding style improvements, error
path printout cleanups, and removal of unneeded
usb_kill_anchored_urbs(). The last 3 convert the driver to use NAPI to
avoid out-of-order reception of CAN frames.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge branch 'sfc-siena-next'
David S. Miller [Fri, 28 Jul 2023 08:54:18 +0000 (09:54 +0100)]
Merge branch 'sfc-siena-next'

Martin Habets says:

====================
sfc: Remove Siena bits from sfc.ko

Last year we split off Siena into it's own driver under directory siena.
This patch series removes the now unused Falcon and Siena code from sfc.ko.
No functional changes are intended.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove vfdi.h
Martin Habets [Thu, 27 Jul 2023 10:41:37 +0000 (11:41 +0100)]
sfc: Remove vfdi.h

It was only used for Siena SRIOV, so nothing includes it any more
in this directory.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Cleanups in io.h
Martin Habets [Thu, 27 Jul 2023 10:41:32 +0000 (11:41 +0100)]
sfc: Cleanups in io.h

Most of the Falcon locking description does not apply to EF10.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Miscellaneous comment removals
Martin Habets [Thu, 27 Jul 2023 10:41:26 +0000 (11:41 +0100)]
sfc: Miscellaneous comment removals

Remove comments that only apply to Falcon and Siena.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove struct efx_special_buffer
Martin Habets [Thu, 27 Jul 2023 10:41:21 +0000 (11:41 +0100)]
sfc: Remove struct efx_special_buffer

The attributes index and entries are no longer needed, so use
struct efx_buffer instead.
next_buffer_table was also Siena specific.
Removed some checkpatch warnings.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Filter cleanups for Falcon and Siena
Martin Habets [Thu, 27 Jul 2023 10:41:15 +0000 (11:41 +0100)]
sfc: Filter cleanups for Falcon and Siena

unicast_filter and multicast_hash are no longer needed.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove some NIC type indirections that are no longer needed
Martin Habets [Thu, 27 Jul 2023 10:41:10 +0000 (11:41 +0100)]
sfc: Remove some NIC type indirections that are no longer needed

The special handling for SRIOV reset and FLR is not needed on EF10.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove PTP code for Siena
Martin Habets [Thu, 27 Jul 2023 10:41:05 +0000 (11:41 +0100)]
sfc: Remove PTP code for Siena

rx_tx_inline is now always true.
The special event list is no longer needed, event handling is always
inline.
Event MCDI_EVENT_CODE_PTP_RX is no longer needed.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove EFX_REV_SIENA_A0
Martin Habets [Thu, 27 Jul 2023 10:40:59 +0000 (11:40 +0100)]
sfc: Remove EFX_REV_SIENA_A0

The workarounds for bug 8568 and 17213 are no longer needed.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove support for siena high priority queue
Martin Habets [Thu, 27 Jul 2023 10:40:54 +0000 (11:40 +0100)]
sfc: Remove support for siena high priority queue

This also removes TC support code, since that was never supported for EF10.
TC support for EF100 is not handled from efx.c.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove siena_nic_data and stats
Martin Habets [Thu, 27 Jul 2023 10:40:48 +0000 (11:40 +0100)]
sfc: Remove siena_nic_data and stats

These are no longer used, and the two  Siena specific functions are
no longer present in sfc.ko.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agosfc: Remove falcon references
Martin Habets [Thu, 27 Jul 2023 10:40:43 +0000 (11:40 +0100)]
sfc: Remove falcon references

Siena was using functions from the Falcon architecture.
These start with the efx_farch prefix.
Now that both of these are in separate modules the references
are no longer used in the sfc.ko module.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge branch 'rxfh-custom-rss'
David S. Miller [Fri, 28 Jul 2023 08:35:54 +0000 (09:35 +0100)]
Merge branch 'rxfh-custom-rss'

Joe Damato says:

====================
rxfh with custom RSS fixes

Greetings:

Welcome to v2, now via net-next. No functional changes; only style
changes (see the summary below).

While attempting to get the RX flow hash key for a custom RSS context on
my mlx5 NIC, I got an error:

$ sudo ethtool -u eth1 rx-flow-hash tcp4 context 1
Cannot get RX network flow hashing options: Invalid argument

I dug into this a bit and noticed two things:

1. ETHTOOL_GRXFH supports custom RSS contexts, but ETHTOOL_SRXFH does
not. I moved the copy logic out of ETHTOOL_GRXFH and into a helper so
that both ETHTOOL_{G,S}RXFH now call it, which fixes ETHTOOL_SRXFH. This
is patch 1/2.

2. mlx5 defaulted to RSS context 0 for both ETHTOOL_{G,S}RXFH paths. I
have modified the driver to support custom contexts for both paths. It
is now possible to get and set the flow hash key for custom RSS contexts
with mlx5. This is patch 2/2.

See commit messages for more details.

Thanks.

v2:
- Rebased on net-next
- Adjusted arguments of mlx5e_rx_res_rss_get_hash_fields and
  mlx5e_rx_res_rss_set_hash_fields to move rss_idx next to the rss
  argument
- Changed return value of both mlx5e_rx_res_rss_get_hash_fields and
  mlx5e_rx_res_rss_set_hash_fields to -ENOENT when the rss entry is
  NULL
- Changed order of local variables in mlx5e_get_rss_hash_opt and
  mlx5e_set_rss_hash_opt
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agonet/mlx5: Fix flowhash key set/get for custom RSS
Joe Damato [Tue, 25 Jul 2023 20:56:55 +0000 (20:56 +0000)]
net/mlx5: Fix flowhash key set/get for custom RSS

mlx5 flow hash field retrieval and set only worked on the default
RSS context as of commit f01cc58c18d6 ("net/mlx5e: Support multiple RSS
contexts"), not custom RSS contexts.

For example, before this patch attempting to retrieve the flow hash fields
for RSS context 1 fails:

$ sudo ethtool -u eth1 rx-flow-hash tcp4 context 1
Cannot get RX network flow hashing options: Invalid argument

This patch fixes getting and setting the flow hash fields for contexts
other than the default context.

Getting the flow hash fields for RSS context 1:

$ sudo ethtool -u eth1 rx-flow-hash tcp4 context 1
For RSS context 1:
TCP over IPV4 flows use these fields for computing Hash flow key:
IP DA

Now, setting the flash hash fields to a custom value:

$ sudo ethtool -U eth1 rx-flow-hash tcp4 sdfn context 1

And retrieving them again:

$ sudo ethtool -u eth1 rx-flow-hash tcp4 context 1
For RSS context 1:
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agonet: ethtool: Unify ETHTOOL_{G,S}RXFH rxnfc copy
Joe Damato [Tue, 25 Jul 2023 20:56:54 +0000 (20:56 +0000)]
net: ethtool: Unify ETHTOOL_{G,S}RXFH rxnfc copy

ETHTOOL_GRXFH correctly copies in the full struct ethtool_rxnfc when
FLOW_RSS is set; ETHTOOL_SRXFH needs a similar code path to handle the
FLOW_RSS case so that ethtool can set the flow hash for custom RSS
contexts (if supported by the driver).

The copy code from ETHTOOL_GRXFH has been pulled out in to a helper so
that it can be called in both ETHTOOL_{G,S}RXFH code paths.

Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge patch series "can: gs_usb: convert to NAPI"
Marc Kleine-Budde [Fri, 28 Jul 2023 07:46:15 +0000 (09:46 +0200)]
Merge patch series "can: gs_usb: convert to NAPI"

Marc Kleine-Budde <mkl@pengutronix.de> says:

Traditionally USB drivers used netif_rx to pass the received CAN
frames/skbs to the network stack. In netif_rx() the skbs are queued to
the local CPU. If IRQs are handled in round robin, CAN frames may be
delivered out-of-order to user space.

To support devices without timestamping the TX path of the rx-offload
helper is cleaned up and extended:
- rename rx_offload_get_echo_skb() ->
  can_rx_offload_get_echo_skb_queue_timestamp()
- add can_rx_offload_get_echo_skb_queue_tail()

The last patch converts the gs_usb driver to NAPI with the rx-offload
helper.

Link: https://lore.kernel.org/all/20230718-gs_usb-rx-offload-v2-0-716e542d14d5@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: convert to NAPI/rx-offload to avoid OoO reception
Marc Kleine-Budde [Fri, 2 Jun 2023 07:29:07 +0000 (09:29 +0200)]
can: gs_usb: convert to NAPI/rx-offload to avoid OoO reception

The driver used to pass received CAN frames/skbs to the network stack
with netif_rx(). In netif_rx() the skbs are queued to the local CPU.
If IRQs are handled in round robin, OoO packets may occur.

To avoid out-of-order reception convert the driver from netif_rx() to
NAPI.

For USB devices with timestamping support use the rx-offload helper
can_rx_offload_queue_timestamp() for the RX, and
can_rx_offload_get_echo_skb_queue_timestamp() for the TX path. Devices
without timestamping support use can_rx_offload_queue_tail() for RX,
and can_rx_offload_get_echo_skb_queue_tail() for the TX path.

Link: https://lore.kernel.org/all/559D628C.5020100@hartkopp.net
Link: https://github.com/candle-usb/candleLight_fw/issues/166
Link: https://lore.kernel.org/all/20230718-gs_usb-rx-offload-v2-3-716e542d14d5@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: rx-offload: add can_rx_offload_get_echo_skb_queue_tail()
Marc Kleine-Budde [Mon, 3 Jul 2023 16:18:19 +0000 (18:18 +0200)]
can: rx-offload: add can_rx_offload_get_echo_skb_queue_tail()

Add can_rx_offload_get_echo_skb_queue_tail(). This function addds the
echo skb at the end of rx-offload the queue. This is intended for
devices without timestamp support.

Link: https://lore.kernel.org/all/20230718-gs_usb-rx-offload-v2-2-716e542d14d5@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: rx-offload: rename rx_offload_get_echo_skb() -> can_rx_offload_get_echo_skb_queu...
Marc Kleine-Budde [Fri, 2 Jun 2023 07:36:38 +0000 (09:36 +0200)]
can: rx-offload: rename rx_offload_get_echo_skb() -> can_rx_offload_get_echo_skb_queue_timestamp()

Rename the rx_offload_get_echo_skb() function to
can_rx_offload_get_echo_skb_queue_timestamp(), since it inserts the
echo skb into the rx-offload queue sorted by timestamp.

This is a preparation for adding
can_rx_offload_get_echo_skb_queue_tail(), which adds the echo skb to
the end of the queue. This is intended for devices that do not support
timestamps.

Link: https://lore.kernel.org/all/20230718-gs_usb-rx-offload-v2-1-716e542d14d5@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agoMerge patch series "can: gs_usb-cleanups: various clenaups"
Marc Kleine-Budde [Fri, 28 Jul 2023 07:44:51 +0000 (09:44 +0200)]
Merge patch series "can: gs_usb-cleanups: various clenaups"

Marc Kleine-Budde <mkl@pengutronix.de> says:

This is a cleanup series of the gs_usb driver. Align the driver more
to the kernel coding style, make use of locally defined variables,
clean up printouts in various error paths, remove some not needed
usb_kill_anchored_urbs() from the shut down paths.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-0-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_usb_disconnect(): remove not needed usb_kill_anchored_urbs()
Marc Kleine-Budde [Thu, 6 Jul 2023 09:38:25 +0000 (11:38 +0200)]
can: gs_usb: gs_usb_disconnect(): remove not needed usb_kill_anchored_urbs()

In gs_usb_disconnect(), all channels are destroyed first, then all
anchored RX URBs (parent->rx_submitted) are disposed with
usb_kill_anchored_urbs().

The call to usb_kill_anchored_urbs() is not needed, as
gs_destroy_candev() of the last active channel already disposes the RX
URBS.

Remove not needed call to usb_kill_anchored_urbs() from
gs_usb_disconnect().

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-11-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_destroy_candev(): remove not needed usb_kill_anchored_urbs()
Marc Kleine-Budde [Thu, 6 Jul 2023 09:38:25 +0000 (11:38 +0200)]
can: gs_usb: gs_destroy_candev(): remove not needed usb_kill_anchored_urbs()

In gs_destroy_candev(), the netdev is unregistered first, then all
anchored TX URBs (dev->tx_submitted) are disposed with
usb_kill_anchored_urbs().

The call to usb_kill_anchored_urbs() is not needed, as
unregister_candev() calls gs_can_close(), which already disposes the
TX URBS.

Remove not needed call to usb_kill_anchored_urbs() from
gs_destroy_candev().

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-10-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_can_close(): don't complain about failed device reset during ndo_stop
Marc Kleine-Budde [Wed, 8 Feb 2023 09:26:33 +0000 (10:26 +0100)]
can: gs_usb: gs_can_close(): don't complain about failed device reset during ndo_stop

When the USB device is unplugged, gs_can_close() (which implements the
struct net_device_ops::ndo_stop callback) is called. In this function
an attempt is made to shut down the USB device with a USB control
message. For disconnected devices this will fail and a warning message
is printed.

Silence the driver by removing the printout of the error message if
the reset command fails.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-9-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_can_start_xmit(), gs_can_open(): clean up printouts in error path
Marc Kleine-Budde [Wed, 12 Jul 2023 13:49:43 +0000 (15:49 +0200)]
can: gs_usb: gs_can_start_xmit(), gs_can_open(): clean up printouts in error path

Remove unnecessary "out of memory" message from the error path of
gs_can_start_xmit() and gs_can_open().

Convert the printout in case of a failing usb_submit_urb() in
gs_can_open() from numbers to human readable error codes.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-8-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_usb_receive_bulk_callback(): count RX overflow errors also in case...
Marc Kleine-Budde [Tue, 4 Jul 2023 09:23:37 +0000 (11:23 +0200)]
can: gs_usb: gs_usb_receive_bulk_callback(): count RX overflow errors also in case of OOM

In case of an RX overflow error from the CAN controller and an OOM
where no skb can be allocated, the error counters are not incremented.

Fix this by first incrementing the error counters and then allocate
the skb.

Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-7-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_usb_receive_bulk_callback(): make use of stats
Marc Kleine-Budde [Mon, 3 Jul 2023 18:39:19 +0000 (20:39 +0200)]
can: gs_usb: gs_usb_receive_bulk_callback(): make use of stats

Make use the previously assigned variable stats instead of using
netdev->stats.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-6-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_usb_receive_bulk_callback(): make use of netdev
Marc Kleine-Budde [Mon, 3 Jul 2023 18:37:20 +0000 (20:37 +0200)]
can: gs_usb: gs_usb_receive_bulk_callback(): make use of netdev

Make use the previously assigned variable netdev instead of using
dev->netdev.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-5-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: uniformly use "parent" as variable name for struct gs_usb
Marc Kleine-Budde [Thu, 6 Jul 2023 14:16:33 +0000 (16:16 +0200)]
can: gs_usb: uniformly use "parent" as variable name for struct gs_usb

To ease readability and maintainability uniformly use the variable
name "parent" for the struct gs_usb in the gs_usb driver.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-4-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_usb_set_timestamp(): remove return statements form void function
Marc Kleine-Budde [Thu, 6 Jul 2023 11:04:45 +0000 (13:04 +0200)]
can: gs_usb: gs_usb_set_timestamp(): remove return statements form void function

Remove the return statements from void gs_usb_set_timestamp()
function, as it's not generally useful.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-3-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: gs_usb_probe(): align block comment
Marc Kleine-Budde [Thu, 6 Jul 2023 11:08:43 +0000 (13:08 +0200)]
can: gs_usb: gs_usb_probe(): align block comment

Indent block comment so that it aligns the * on each line.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-2-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: gs_usb: remove leading space from goto labels
Marc Kleine-Budde [Tue, 18 Jul 2023 09:11:22 +0000 (11:11 +0200)]
can: gs_usb: remove leading space from goto labels

Remove leading spaces from goto labels in accordance with the kernel
encoding style.

Link: https://lore.kernel.org/all/20230718-gs_usb-cleanups-v1-1-c3b9154ec605@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: Explicitly include correct DT includes, part 2
Rob Herring [Mon, 24 Jul 2023 21:18:40 +0000 (15:18 -0600)]
can: Explicitly include correct DT includes, part 2

The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.

Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/all/20230724211841.805053-1-robh@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agoMerge patch series "Add support for Allwinner D1 CAN controllers"
Marc Kleine-Budde [Fri, 28 Jul 2023 06:49:32 +0000 (08:49 +0200)]
Merge patch series "Add support for Allwinner D1 CAN controllers"

John Watts <contact@jookia.org> says:

This patch series adds support for the Allwinner D1 CAN controllers.
It requires adding a new device tree compatible and driver support to
work around some hardware quirks.

This has been tested on the Mango Pi MQ Dual running a T113 and a
Lichee Panel 86 running a D1.

Changes in v2:
- Re-ordered patches to work with bisecting
- Fixed device tree label underscores
- Fixed email headers
- Link to v1: https://lore.kernel.org/all/20230715112523.2533742-1-contact@jookia.org

Link: https://lore.kernel.org/all/20230721221552.1973203-2-contact@jookia.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: peak_usb: remove unused/legacy peak_usb_netif_rx() function
Peter Seiderer [Fri, 21 Jul 2023 18:07:58 +0000 (20:07 +0200)]
can: peak_usb: remove unused/legacy peak_usb_netif_rx() function

Remove unused/legacy peak_usb_netif_rx() function (not longer used
since commit 28e0a70cede3 ("can: peak_usb: CANFD: store 64-bits hw
timestamps").

Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Link: https://lore.kernel.org/all/20230721180758.26199-1-ps.report@gmx.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: sun4i_can: Add support for the Allwinner D1
John Watts [Fri, 21 Jul 2023 22:15:53 +0000 (08:15 +1000)]
can: sun4i_can: Add support for the Allwinner D1

The controllers present in the D1 are extremely similar to the R40
and require the same reset quirks, but An extra quirk is needed to support
receiving packets.

Signed-off-by: John Watts <contact@jookia.org>
Link: https://lore.kernel.org/all/20230721221552.1973203-6-contact@jookia.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agoMAINTAINERS: Add myself as maintainer of the ems_pci.c driver
Gerhard Uttenthaler [Thu, 20 Jul 2023 14:40:32 +0000 (16:40 +0200)]
MAINTAINERS: Add myself as maintainer of the ems_pci.c driver

At the suggestion of Marc Kleine-Budde [1], I add myself as maintainer
of the ems_pci.c driver.

[1] https://lore.kernel.org/all/20230720-purplish-quizzical-247024e66671-mkl@pengutronix.de

Signed-off-by: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com>
Link: https://lore.kernel.org/all/20230720144032.28960-1-uttenthaler@ems-wuensche.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agocan: sun4i_can: Add acceptance register quirk
John Watts [Fri, 21 Jul 2023 22:15:52 +0000 (08:15 +1000)]
can: sun4i_can: Add acceptance register quirk

The Allwinner D1's CAN controllers have the ACPC and ACPM registers
moved down. Compensate for this by adding an offset quirk for the
acceptance registers.

Signed-off-by: John Watts <contact@jookia.org>
Link: https://lore.kernel.org/all/20230721221552.1973203-5-contact@jookia.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agoriscv: dts: allwinner: d1: Add CAN controller nodes
John Watts [Fri, 21 Jul 2023 22:15:51 +0000 (08:15 +1000)]
riscv: dts: allwinner: d1: Add CAN controller nodes

The Allwinner D1, T113 provide two CAN controllers that are variants
of the R40 controller.

I have tested support for these controllers on two boards:

- A Lichee Panel RV 86 Panel running a D1 chip
- A Mango Pi MQ Dual running a T113-s3 chip

Both of these fully support both CAN controllers.

Signed-off-by: John Watts <contact@jookia.org>
Link: https://lore.kernel.org/all/20230721221552.1973203-4-contact@jookia.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agodt-bindings: net: can: Add support for Allwinner D1 CAN controller
John Watts [Fri, 21 Jul 2023 22:15:50 +0000 (08:15 +1000)]
dt-bindings: net: can: Add support for Allwinner D1 CAN controller

The Allwinner D1 has two CAN controllers, both a variant of the R40
controller. Unfortunately the registers for the D1 controllers are
moved around enough to be incompatible and require a new compatible.

Introduce the "allwinner,sun20i-d1-can" compatible to support this.

Signed-off-by: John Watts <contact@jookia.org>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/all/20230721221552.1973203-3-contact@jookia.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
11 months agonet: Explicitly include correct DT includes
Rob Herring [Thu, 27 Jul 2023 01:49:39 +0000 (19:49 -0600)]
net: Explicitly include correct DT includes

The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.

Acked-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230727014944.3972546-1-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoRevert "net: stmmac: correct MAC propagation delay"
Jakub Kicinski [Wed, 26 Jul 2023 22:40:54 +0000 (15:40 -0700)]
Revert "net: stmmac: correct MAC propagation delay"

This reverts commit 20bf98c94146eb6fe62177817cb32f53e72dd2e8.

Richard raised concerns about correctness of the code on previous
generations of the HW.

Fixes: 20bf98c94146 ("net: stmmac: correct MAC propagation delay")
Link: https://lore.kernel.org/all/ZMGIuKVP7BEotbrn@hoboy.vegasvil.org/
Link: https://lore.kernel.org/r/20230726224054.3241127-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'net-stmmac-increase-clk_ptp_ref-rate'
Jakub Kicinski [Fri, 28 Jul 2023 03:32:59 +0000 (20:32 -0700)]
Merge branch 'net-stmmac-increase-clk_ptp_ref-rate'

Andrew Halaney says:

====================
net: stmmac: Increase clk_ptp_ref rate

This series aims to increase the clk_ptp_ref rate to get the best
possible PTP timestamping resolution possible. Some modified disclosure
about my development/testing process from the RFC/RFT v1 follows.

Disclosure: I don't know much about PTP beyond what you can google in an
afternoon, don't have access to documentation about the stmmac IP,
and have only tested that (based on code comments and git commit
history) the programming of the subsecond register (and the clock rate)
makes more sense with these changes. Qualcomm has tested a similar
change offlist, verifying PTP more formally as I understand it.

The last version was an RFC/RFT, but I didn't get a lot of confirmation
that doing patch 3 in that series (essentially setting clk_ptp_ref to
whatever its max value is) for the whole stmmac ecosystem was a safe
idea. So I am erring on the side of caution and doing this for the
Qualcomm platform only. See v1 for an approach that would apply to
all stmmac platform drivers with clk_ptp_ref.

v1: https://lore.kernel.org/netdev/20230711205732.364954-1-ahalaney@redhat.com/
====================

Link: https://lore.kernel.org/r/20230725211853.895832-2-ahalaney@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: stmmac: dwmac-qcom-ethqos: Use max frequency for clk_ptp_ref
Andrew Halaney [Tue, 25 Jul 2023 21:04:26 +0000 (16:04 -0500)]
net: stmmac: dwmac-qcom-ethqos: Use max frequency for clk_ptp_ref

Qualcomm clocks can set their frequency to a variety of levels
generally. Let's use the max for clk_ptp_ref to ensure the best
timestamping resolution possible.

Without this, the default value of the clock is used. For sa8775p-ride
this is 19.2 MHz, far less than the 230.4 MHz possible.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Link: https://lore.kernel.org/r/20230725211853.895832-4-ahalaney@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: stmmac: Make ptp_clk_freq_config variable type explicit
Andrew Halaney [Tue, 25 Jul 2023 21:04:25 +0000 (16:04 -0500)]
net: stmmac: Make ptp_clk_freq_config variable type explicit

The priv variable is _always_ of type (struct stmmac_priv *), so let's
stop using (void *) since it isn't abstracting anything.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Link: https://lore.kernel.org/r/20230725211853.895832-3-ahalaney@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge tag 'nf-next-23-07-27' of https://git.kernel.org/pub/scm/linux/kernel/git/netfi...
Jakub Kicinski [Fri, 28 Jul 2023 03:25:43 +0000 (20:25 -0700)]
Merge tag 'nf-next-23-07-27' of https://git./linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter updates for net-next

1.  silence a harmless warning for CONFIG_NF_CONNTRACK_PROCFS=n builds,
 from Zhu Wang.

2, 3:
Allow NLA_POLICY_MASK to be used with BE16/BE32 types, and replace a few
manual checks with nla_policy based one in nf_tables, from myself.

4: cleanup in ctnetlink to validate while parsing rather than
   using two steps, from Lin Ma.

5: refactor boyer-moore textsearch by moving a small chunk to
   a helper function, rom Jeremy Sowden.

* tag 'nf-next-23-07-27' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  lib/ts_bm: add helper to reduce indentation and improve readability
  netfilter: conntrack: validate cta_ip via parsing
  netfilter: nf_tables: use NLA_POLICY_MASK to test for valid flag options
  netlink: allow be16 and be32 types in all uint policy checks
  nf_conntrack: fix -Wunused-const-variable=
====================

Link: https://lore.kernel.org/r/20230727133604.8275-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'net-tls-fixes-for-nvme-over-tls'
Jakub Kicinski [Fri, 28 Jul 2023 02:49:38 +0000 (19:49 -0700)]
Merge branch 'net-tls-fixes-for-nvme-over-tls'

Hannes Reinecke says:

====================
net/tls: fixes for NVMe-over-TLS

here are some small fixes to get NVMe-over-TLS up and running.
The first set are just minor modifications to have MSG_EOR handled
for TLS, but the second set implements the ->read_sock() callback
for tls_sw.
The ->read_sock() callbacks return -EIO when encountering any TLS
Alert message, but as that's the default behaviour anyway I guess
we can get away with it.
====================

Applied on top of the tag in case Sagi gets convinced to pull it.

Link: https://lore.kernel.org/r/20230726191556.41714-1-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/tls: implement ->read_sock()
Hannes Reinecke [Wed, 26 Jul 2023 19:15:56 +0000 (21:15 +0200)]
net/tls: implement ->read_sock()

Implement ->read_sock() function for use with nvme-tcp.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Cc: Boris Pismenny <boris.pismenny@gmail.com>
Link: https://lore.kernel.org/r/20230726191556.41714-7-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/tls: split tls_rx_reader_lock
Hannes Reinecke [Wed, 26 Jul 2023 19:15:55 +0000 (21:15 +0200)]
net/tls: split tls_rx_reader_lock

Split tls_rx_reader_{lock,unlock} into an 'acquire/release' and
the actual locking part.
With that we can use the tls_rx_reader_lock in situations where
the socket is already locked.

Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-6-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/tls: Use tcp_read_sock() instead of ops->read_sock()
Hannes Reinecke [Wed, 26 Jul 2023 19:15:54 +0000 (21:15 +0200)]
net/tls: Use tcp_read_sock() instead of ops->read_sock()

TLS resets the protocol operations, so the read_sock() callback might
be changed, too.
In this case using sock->ops->readsock() in tls_strp_read_copyin() will
enter an infinite recursion if the read_sock() callback is calling
tls_rx_rec_wait() which will call into sock->ops->readsock() via
tls_strp_read_copyin().
But as tls_strp_read_copyin() is supposed to produce data from the
consumed socket and that socket is always a TCP socket we can call
tcp_read_sock() directly without having to deal with callbacks.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-5-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests/net/tls: add test for MSG_EOR
Hannes Reinecke [Wed, 26 Jul 2023 19:15:53 +0000 (21:15 +0200)]
selftests/net/tls: add test for MSG_EOR

As the recent patch is modifying the behaviour for TLS re MSG_EOR
handling we should be having a test for it.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-4-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/tls: handle MSG_EOR for tls_device TX flow
Hannes Reinecke [Wed, 26 Jul 2023 19:15:52 +0000 (21:15 +0200)]
net/tls: handle MSG_EOR for tls_device TX flow

tls_push_data() MSG_MORE, but bails out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of MSG_MORE
this patch adds handling MSG_EOR by treating it as the
absence of MSG_MORE.
Consequently we should return an error when both are set.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-3-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/tls: handle MSG_EOR for tls_sw TX flow
Hannes Reinecke [Wed, 26 Jul 2023 19:15:51 +0000 (21:15 +0200)]
net/tls: handle MSG_EOR for tls_sw TX flow

tls_sw_sendmsg() already handles MSG_MORE, but bails
out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE this patch adds handling MSG_EOR by treating
it as the negation of MSG_MORE.
And erroring out if MSG_EOR is specified with MSG_MORE.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-2-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: datalink: Remove unused declarations
YueHaibing [Wed, 26 Jul 2023 14:40:54 +0000 (22:40 +0800)]
net: datalink: Remove unused declarations

These declarations is not used after ipx protocol removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230726144054.28780-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: Remove unused declaration dev_restart()
YueHaibing [Wed, 26 Jul 2023 14:37:15 +0000 (22:37 +0800)]
net: Remove unused declaration dev_restart()

This is not used, so can remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230726143715.24700-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agodccp: Remove unused declaration dccp_feat_initialise_sysctls()
YueHaibing [Wed, 26 Jul 2023 14:32:39 +0000 (22:32 +0800)]
dccp: Remove unused declaration dccp_feat_initialise_sysctls()

This is never used, so can remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230726143239.9904-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agobridge: Remove unused declaration br_multicast_set_hash_max()
YueHaibing [Wed, 26 Jul 2023 14:31:41 +0000 (22:31 +0800)]
bridge: Remove unused declaration br_multicast_set_hash_max()

Since commit 19e3a9c90c53 ("net: bridge: convert multicast to generic rhashtable")
this is not used, so can remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230726143141.11704-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: remove comment in ndisc_router_discovery
Patrick Rohr [Wed, 26 Jul 2023 18:47:42 +0000 (11:47 -0700)]
net: remove comment in ndisc_router_discovery

Removes superfluous (and misplaced) comment from ndisc_router_discovery.

Signed-off-by: Patrick Rohr <prohr@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230726184742.342825-1-prohr@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 27 Jul 2023 22:21:46 +0000 (15:21 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes after downstream PR.

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge tag 'net-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 27 Jul 2023 19:27:37 +0000 (12:27 -0700)]
Merge tag 'net-6.5-rc4' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can, netfilter.

  Current release - regressions:

   - core: fix splice_to_socket() for O_NONBLOCK socket

   - af_unix: fix fortify_panic() in unix_bind_bsd().

   - can: raw: fix lockdep issue in raw_release()

  Previous releases - regressions:

   - tcp: reduce chance of collisions in inet6_hashfn().

   - netfilter: skip immediate deactivate in _PREPARE_ERROR

   - tipc: stop tipc crypto on failure in tipc_node_create

   - eth: igc: fix kernel panic during ndo_tx_timeout callback

   - eth: iavf: fix potential deadlock on allocation failure

  Previous releases - always broken:

   - ipv6: fix bug where deleting a mngtmpaddr can create a new
     temporary address

   - eth: ice: fix memory management in ice_ethtool_fdir.c

   - eth: hns3: fix the imp capability bit cannot exceed 32 bits issue

   - eth: vxlan: calculate correct header length for GPE

   - eth: stmmac: apply redundant write work around on 4.xx too"

* tag 'net-6.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  tipc: stop tipc crypto on failure in tipc_node_create
  af_unix: Terminate sun_path when bind()ing pathname socket.
  tipc: check return value of pskb_trim()
  benet: fix return value check in be_lancer_xmit_workarounds()
  virtio-net: fix race between set queues and probe
  net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64
  splice, net: Fix splice_to_socket() for O_NONBLOCK socket
  net: fec: tx processing does not call XDP APIs if budget is 0
  mptcp: more accurate NL event generation
  selftests: mptcp: join: only check for ip6tables if needed
  tools: ynl-gen: fix parse multi-attr enum attribute
  tools: ynl-gen: fix enum index in _decode_enum(..)
  netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID
  netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR
  netfilter: nft_set_rbtree: fix overlap expiration walk
  igc: Fix Kernel Panic during ndo_tx_timeout callback
  net: dsa: qca8k: fix mdb add/del case with 0 VID
  net: dsa: qca8k: fix broken search_and_del
  net: dsa: qca8k: fix search_and_insert wrong handling of new rule
  net: dsa: qca8k: enable use_single_write for qca8xxx
  ...

11 months agoMerge tag 'soundwire-6.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 27 Jul 2023 19:07:41 +0000 (12:07 -0700)]
Merge tag 'soundwire-6.5-fixes' of git://git./linux/kernel/git/vkoul/soundwire

Pull soundwire fixes from Vinod Koul:

 - Core fix for enumeration completion

 - Qualcomm driver fix to update status

 - AMD driver fix for probe error check

* tag 'soundwire-6.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: amd: Fix a check for errors in probe()
  soundwire: qcom: update status correctly with mask
  soundwire: fix enumeration completion

11 months agoMerge tag 'phy-fixes-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Linus Torvalds [Thu, 27 Jul 2023 18:52:41 +0000 (11:52 -0700)]
Merge tag 'phy-fixes-6.5' of git://git./linux/kernel/git/phy/linux-phy

Pull phy fixes from Vinod Koul:

 - Out of bound fix for hisilicon phy

 - Qualcomm synopsis femto phy for keeping clock enabled during suspend
   and enabling ref clocks

 - Mediatek driver fixes for upper limit test and error code

* tag 'phy-fixes-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
  phy: hisilicon: Fix an out of bounds check in hisi_inno_phy_probe()
  phy: qcom-snps-femto-v2: use qcom_snps_hsphy_suspend/resume error code
  phy: qcom-snps-femto-v2: properly enable ref clock
  phy: qcom-snps-femto-v2: keep cfg_ahb_clk enabled during runtime suspend
  phy: mediatek: hdmi: mt8195: fix prediv bad upper limit test
  phy: phy-mtk-dp: Fix an error code in probe()

11 months agoMerge tag 'for-6.5-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Thu, 27 Jul 2023 18:44:08 +0000 (11:44 -0700)]
Merge tag 'for-6.5-rc3-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix accounting of global block reserve size when block group tree is
   enabled

 - the async discard has been enabled in 6.2 unconditionally, but for
   zoned mode it does not make that much sense to do it asynchronously
   as the zones are reset as needed

 - error handling and proper error value propagation fixes

* tag 'for-6.5-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: check for commit error at btrfs_attach_transaction_barrier()
  btrfs: check if the transaction was aborted at btrfs_wait_for_commit()
  btrfs: remove BUG_ON()'s in add_new_free_space()
  btrfs: account block group tree when calculating global reserve size
  btrfs: zoned: do not enable async discard

11 months agoMerge tag 'fixes-2023-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt...
Linus Torvalds [Thu, 27 Jul 2023 18:37:34 +0000 (11:37 -0700)]
Merge tag 'fixes-2023-07-27' of git://git./linux/kernel/git/rppt/memblock

Pull memblock fix from Mike Rapoport:
 "A call to memblock_free() or memblock_phys_free() issued after
  memblock data is discarded will result in use after free in
  memblock_isolate_range().

  Avoid those issues by making sure that memblock_discard points
  memblock.reserved.regions back at the static buffer"

* tag 'fixes-2023-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  mm,memblock: reset memblock.reserved to system init state to prevent UAF

11 months agomm: lock_vma_under_rcu() must check vma->anon_vma under vma lock
Jann Horn [Wed, 26 Jul 2023 21:41:03 +0000 (23:41 +0200)]
mm: lock_vma_under_rcu() must check vma->anon_vma under vma lock

lock_vma_under_rcu() tries to guarantee that __anon_vma_prepare() can't
be called in the VMA-locked page fault path by ensuring that
vma->anon_vma is set.

However, this check happens before the VMA is locked, which means a
concurrent move_vma() can concurrently call unlink_anon_vmas(), which
disassociates the VMA's anon_vma.

This means we can get UAF in the following scenario:

  THREAD 1                   THREAD 2
  ========                   ========
  <page fault>
    lock_vma_under_rcu()
      rcu_read_lock()
      mas_walk()
      check vma->anon_vma

                             mremap() syscall
                               move_vma()
                                vma_start_write()
                                 unlink_anon_vmas()
                             <syscall end>

    handle_mm_fault()
      __handle_mm_fault()
        handle_pte_fault()
          do_pte_missing()
            do_anonymous_page()
              anon_vma_prepare()
                __anon_vma_prepare()
                  find_mergeable_anon_vma()
                    mas_walk() [looks up VMA X]

                             munmap() syscall (deletes VMA X)

                    reusable_anon_vma() [called on freed VMA X]

This is a security bug if you can hit it, although an attacker would
have to win two races at once where the first race window is only a few
instructions wide.

This patch is based on some previous discussion with Linus Torvalds on
the security list.

Cc: stable@vger.kernel.org
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 months agoice: update reset path for SRIOV LAG support
Dave Ertman [Tue, 20 Jun 2023 22:18:54 +0000 (15:18 -0700)]
ice: update reset path for SRIOV LAG support

Add code to rebuild the LAG resources when rebuilding the state of the
interface after a reset.

Also added in a function for building per-queue information into the buffer
used to configure VF queues for LAG fail-over.  This improves code reuse.

Due to differences in timing per interface for recovering from a reset, add
in the ability to retry on non-local dependencies where needed.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: enforce no DCB config changing when in bond
Dave Ertman [Tue, 20 Jun 2023 22:18:53 +0000 (15:18 -0700)]
ice: enforce no DCB config changing when in bond

To support SRIOV LAG, the driver cannot allow changes to an interface's DCB
configuration when in a bond.  This would break the ability to modify
interfaces Tx scheduling for fail-over interfaces.

Block kernel generated DCB config events when in a bond.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: enforce interface eligibility and add messaging for SRIOV LAG
Dave Ertman [Tue, 20 Jun 2023 22:18:52 +0000 (15:18 -0700)]
ice: enforce interface eligibility and add messaging for SRIOV LAG

Implement checks on what interfaces are eligible for supporting SRIOV VFs
when a member of an aggregate interface.

Implement unwind path for interfaces that become ineligible.

checks for the SRIOV LAG feature bit wrap most of the functional code for
manipulating resources that apply to this feature.  Utilize this bit
to track compliant aggregates.  Also flag any new entries into the
aggregate as not supporting SRIOV LAG for the time they are in the
non-compliant aggregate.

Once an aggregate has been flagged as non-compliant, only unpopulating the
aggregate and re-populating it will return SRIOV LAG functionality.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: support non-standard teardown of bond interface
Dave Ertman [Tue, 20 Jun 2023 22:18:51 +0000 (15:18 -0700)]
ice: support non-standard teardown of bond interface

Code for supporting removal of the PF driver (NETDEV_UNREGISTER) events for
both when the bond has the primary interface as active and when failed over
to thew secondary interface.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: Flesh out implementation of support for SRIOV on bonded interface
Dave Ertman [Tue, 20 Jun 2023 22:18:50 +0000 (15:18 -0700)]
ice: Flesh out implementation of support for SRIOV on bonded interface

Add in the functions that will allow a VF created on the primary interface
of a bond to "fail-over" to another PF interface in the bond and continue
to Tx and Rx.

Add in an ordered take-down path for the bonded interface.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: process events created by lag netdev event handler
Dave Ertman [Tue, 20 Jun 2023 22:18:49 +0000 (15:18 -0700)]
ice: process events created by lag netdev event handler

Add in the function framework for the processing of LAG events.  Also add
in helper function to perform common tasks.

Add the basis of the process of linking a lower netdev to an upper netdev.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: implement lag netdev event handler
Dave Ertman [Tue, 20 Jun 2023 22:18:48 +0000 (15:18 -0700)]
ice: implement lag netdev event handler

The event handler for LAG will create a work item to place on the ordered
workqueue to be processed.

Add in defines for training packets and new recipes to be used by the
switching block of the HW for LAG packet steering.

Update the ice_lag struct to reflect the new processing methodology.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: changes to the interface with the HW and FW for SRIOV_VF+LAG
Dave Ertman [Tue, 20 Jun 2023 22:18:47 +0000 (15:18 -0700)]
ice: changes to the interface with the HW and FW for SRIOV_VF+LAG

Add defines needed for interaction with the FW admin queue interface
in relation to supporting LAG and SRIOV VFs interacting.

Add code, or make non-static previously static functions, to access
the new and changed admin queue calls for LAG.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: Add driver support for firmware changes for LAG
Dave Ertman [Tue, 20 Jun 2023 22:18:46 +0000 (15:18 -0700)]
ice: Add driver support for firmware changes for LAG

Add the defines, fields, and detection code for FW support of LAG for
SRIOV.  Also exposes some previously static functions to allow access
in the lag code.

Clean up code that is unused or not needed for LAG support.  Also add
an ordered workqueue for processing LAG events.

Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoice: Correctly initialize queue context values
Jacob Keller [Tue, 20 Jun 2023 22:18:45 +0000 (15:18 -0700)]
ice: Correctly initialize queue context values

The ice_alloc_lan_q_ctx function allocates the queue context array for a
given traffic class. This function uses devm_kcalloc which will
zero-allocate the structure. Thus, prior to any queue being setup by
ice_ena_vsi_txq, the q_ctx structure will have a q_handle of 0 and a q_teid
of 0. These are potentially valid values.

Modify the ice_alloc_lan_q_ctx function to initialize every member of the
q_ctx array to have invalid values. Modify ice_dis_vsi_txq to ensure that
it assigns q_teid to an invalid value when it assigns q_handle to the
invalid value as well.

This will allow other code to check whether the queue context is currently
valid before operating on it.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
11 months agoMerge branch 'virtio-vsock-some-updates-for-msg_peek-flag'
Paolo Abeni [Thu, 27 Jul 2023 13:51:50 +0000 (15:51 +0200)]
Merge branch 'virtio-vsock-some-updates-for-msg_peek-flag'

Arseniy Krasnov says:

====================
virtio/vsock: some updates for MSG_PEEK flag

This patchset does several things around MSG_PEEK flag support. In
general words it reworks MSG_PEEK test and adds support for this flag
in SOCK_SEQPACKET logic. Here is per-patch description:

1) This is cosmetic change for SOCK_STREAM implementation of MSG_PEEK:
   1) I think there is no need of "safe" mode walk here as there is no
      "unlink" of skbs inside loop (it is MSG_PEEK mode - we don't change
      queue).
   2) Nested while loop is removed: in case of MSG_PEEK we just walk
      over skbs and copy data from each one. I guess this nested loop
      even didn't behave as loop - it always executed just for single
      iteration.

2) This adds MSG_PEEK support for SOCK_SEQPACKET. It could be implemented
   be reworking MSG_PEEK callback for SOCK_STREAM to support SOCK_SEQPACKET
   also, but I think it will be more simple and clear from potential
   bugs to implemented it as separate function thus not mixing logics
   for both types of socket. So I've added it as dedicated function.

3) This is reworked MSG_PEEK test for SOCK_STREAM. Previous version just
   sent single byte, then tried to read it with MSG_PEEK flag, then read
   it in normal way. New version is more complex: now sender uses buffer
   instead of single byte and this buffer is initialized with random
   values. Receiver tests several things:
   1) Read empty socket with MSG_PEEK flag.
   2) Read part of buffer with MSG_PEEK flag.
   3) Read whole buffer with MSG_PEEK flag, then checks that it is same
      as buffer from 2) (limited by size of buffer from 2) of course).
   4) Read whole buffer without any flags, then checks that it is same
      as buffer from 3).

4) This is MSG_PEEK test for SOCK_SEQPACKET. It works in the same way
   as for SOCK_STREAM, except it also checks combination of MSG_TRUNC
   and MSG_PEEK.
====================

Link: https://lore.kernel.org/r/20230725172912.1659970-1-AVKrasnov@sberdevices.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agovsock/test: MSG_PEEK test for SOCK_SEQPACKET
Arseniy Krasnov [Tue, 25 Jul 2023 17:29:12 +0000 (20:29 +0300)]
vsock/test: MSG_PEEK test for SOCK_SEQPACKET

This adds MSG_PEEK test for SOCK_SEQPACKET. It works in the same way as
SOCK_STREAM test, except it also tests MSG_TRUNC flag.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agovsock/test: rework MSG_PEEK test for SOCK_STREAM
Arseniy Krasnov [Tue, 25 Jul 2023 17:29:11 +0000 (20:29 +0300)]
vsock/test: rework MSG_PEEK test for SOCK_STREAM

This new version makes test more complicated by adding empty read,
partial read and data comparisons between MSG_PEEK and normal reads.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agovirtio/vsock: support MSG_PEEK for SOCK_SEQPACKET
Arseniy Krasnov [Tue, 25 Jul 2023 17:29:10 +0000 (20:29 +0300)]
virtio/vsock: support MSG_PEEK for SOCK_SEQPACKET

This adds support of MSG_PEEK flag for SOCK_SEQPACKET type of socket.
Difference with SOCK_STREAM is that this callback returns either length
of the message or error.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agovirtio/vsock: rework MSG_PEEK for SOCK_STREAM
Arseniy Krasnov [Tue, 25 Jul 2023 17:29:09 +0000 (20:29 +0300)]
virtio/vsock: rework MSG_PEEK for SOCK_STREAM

This reworks current implementation of MSG_PEEK logic:
1) Replaces 'skb_queue_walk_safe()' with 'skb_queue_walk()'. There is
   no need in the first one, as there are no removes of skb in loop.
2) Removes nested while loop - MSG_PEEK logic could be implemented
   without it: just iterate over skbs without removing it and copy
   data from each until destination buffer is not full.

Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agolib/ts_bm: add helper to reduce indentation and improve readability
Jeremy Sowden [Tue, 20 Jun 2023 18:11:00 +0000 (19:11 +0100)]
lib/ts_bm: add helper to reduce indentation and improve readability

The flow-control of `bm_find` is very deeply nested with a conditional
comparing a ternary expression against the pattern inside a for-loop
inside a while-loop inside a for-loop.

Move the inner for-loop into a helper function to reduce the amount of
indentation and make the code easier to read.

Fix indentation and trailing white-space in preceding debug logging
statement.

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
11 months agonetfilter: conntrack: validate cta_ip via parsing
Lin Ma [Wed, 12 Jul 2023 13:32:36 +0000 (21:32 +0800)]
netfilter: conntrack: validate cta_ip via parsing

In current ctnetlink_parse_tuple_ip() function, nested parsing and
validation is splitting as two parts,  which could be cleanup to a
simplified form. As the nla_parse_nested_deprecated function
supports validation in the fly. These two finially reach same place
__nla_validate_parse with same validate flag.

nla_parse_nested_deprecated
  __nla_parse(.., NL_VALIDATE_LIBERAL, ..)
    __nla_validate_parse

nla_validate_nested_deprecated
  __nla_validate_nested(.., NL_VALIDATE_LIBERAL, ..)
    __nla_validate
      __nla_validate_parse

This commit removes the call to nla_validate_nested_deprecated and pass
cta_ip_nla_policy when do parsing.

Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Florian Westphal <fw@strlen.de>