git.osdn.net Git - tomoyo/tomoyo-test1.git/log

Merge branch 'create-netdevsim-instances-in-namespace'

Jiri Pirko says:

====================
create netdevsim instances in namespace

Allow user to create netdevsim devlink and netdevice instances in a
network namespace according to the namespace where the user resides in.
Add a selftest to test this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: test creating netdevsim inside network namespace

Add a test that creates netdevsim instance inside network namespace
and verifies that the related devlink instance and port netdevices
reside in the namespace.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: create devlink and netdev instances in namespace

When user does create new netdevsim instance using sysfs bus file,
create the devlink instance and related netdev instance in the namespace
of the caller.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: devlink: export devlink net setter

For newly allocated devlink instance allow drivers to set net struct

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-tls-add-ctrl-path-tracing-and-statistics'

Jakub Kicinski says:

====================
net/tls: add ctrl path tracing and statistics

This set adds trace events related to TLS offload and basic MIB stats
for TLS.

First patch contains the TLS offload related trace points. Those are
helpful in troubleshooting offload issues, especially around the
resync paths.

Second patch adds a tracepoint to the fastpath of device offload,
it's separated out in case there will be objections to adding
fast path tracepoints. Again, it's quite useful for debugging
offload issues.

Next four patches add MIB statistics. The statistics are implemented
as per-cpu per-netns counters. Since there are currently no fast path
statistics we could move to atomic variables. Per-CPU seem more common.

Most basic statistics are number of created and live sessions, broken
out to offloaded and non-offloaded. Users seem to like those a lot.

Next there is a statistic for decryption errors. These are primarily
useful for device offload debug, in normal deployments decryption
errors should not be common.

Last but not least a counter for device RX resync.
====================

Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: add TlsDeviceRxResync statistic

Add a statistic for number of RX resyncs sent down to the NIC.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: add TlsDecryptError stat

Add a statistic for TLS record decryption errors.

Since devices are supposed to pass records as-is when they
encounter errors this statistic will count bad records in
both pure software and inline crypto configurations.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: add statistics for installed sessions

Add SNMP stats for number of sockets with successfully
installed sessions. Break them down to software and
hardware ones. Note that if hardware offload fails
stack uses software implementation, and counts the
session appropriately.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: add skeleton of MIB statistics

Add a skeleton structure for adding TLS statistics.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: add device decrypted trace point

Add a tracepoint to the TLS offload's fast path. This tracepoint
can be used to track the decrypted and encrypted status of received
records. Records decrypted by the device should have decrypted set
to 1, records which have neither decrypted nor decrypted set are
partially decrypted, require re-encryption and therefore are most
expensive to deal with.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: add tracing for device/offload events

Add tracing of device-related interaction to aid performance
analysis, especially around resync:

tls:tls_device_offload_set
tls:tls_device_rx_resync_send
tls:tls_device_rx_resync_nh_schedule
tls:tls_device_rx_resync_nh_delay
tls:tls_device_tx_resync_req
tls:tls_device_tx_resync_send

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/netdev/net

Merge tag 'kbuild-fixes-v5.4' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- remove unneeded ar-option and KBUILD_ARFLAGS

- remove long-deprecated SUBDIRS

- fix modpost to suppress false-positive warnings for UML builds

- fix namespace.pl to handle relative paths to ${objtree}, ${srctree}

- make setlocalversion work for /bin/sh

- make header archive reproducible

- fix some Makefiles and documents

* tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kheaders: make headers archive reproducible
  kbuild: update compile-test header list for v5.4-rc2
  kbuild: two minor updates for Documentation/kbuild/modules.rst
  scripts/setlocalversion: clear local variable to make it work for sh
  namespace: fix namespace.pl script to support relative paths
  video/logo: do not generate unneeded logo C files
  video/logo: remove unneeded *.o pattern from clean-files
  integrity: remove pointless subdir-$(CONFIG_...)
  integrity: remove unneeded, broken attempt to add -fshort-wchar
  modpost: fix static EXPORT_SYMBOL warnings for UML build
  kbuild: correct formatting of header in kbuild module docs
  kbuild: remove SUBDIRS support
  kbuild: remove ar-option and KBUILD_ARFLAGS

Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Twelve patches mostly small but obvious fixes or cosmetic but small
  updates"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Fix Nport ID display value
  scsi: qla2xxx: Fix N2N link up fail
  scsi: qla2xxx: Fix N2N link reset
  scsi: qla2xxx: Optimize NPIV tear down process
  scsi: qla2xxx: Fix stale mem access on driver unload
  scsi: qla2xxx: Fix unbound sleep in fcport delete path.
  scsi: qla2xxx: Silence fwdump template message
  scsi: hisi_sas: Make three functions static
  scsi: megaraid: disable device when probe failed after enabled device
  scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue
  scsi: qedf: Remove always false 'tmp_prio < 0' statement
  scsi: ufs: skip shutdown if hba is not powered
  scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF

Merge branch 'readdir' (readdir speedup and sanity checking)

This makes getdents() and getdents64() do sanity checking on the
pathname that it gives to user space.  And to mitigate the performance
impact of that, it first cleans up the way it does the user copying, so
that the code avoids doing the SMAP/PAN updates between each part of the
dirent structure write.

I really wanted to do this during the merge window, but didn't have
time.  The conversion of filldir to unsafe_put_user() is something I've
had around for years now in a private branch, but the extra pathname
checking finally made me clean it up to the point where it is mergable.

It's worth noting that the filename validity checking really should be a
bit smarter: it would be much better to delay the error reporting until
the end of the readdir, so that non-corrupted filenames are still
returned.  But that involves bigger changes, so let's see if anybody
actually hits the corrupt directory entry case before worrying about it
further.

* branch 'readdir':
  Make filldir[64]() verify the directory entry filename is valid
  Convert filldir[64]() from __put_user() to unsafe_put_user()

Make filldir[64]() verify the directory entry filename is valid

This has been discussed several times, and now filesystem people are
talking about doing it individually at the filesystem layer, so head
that off at the pass and just do it in getdents{64}().

This is partially based on a patch by Jann Horn, but checks for NUL
bytes as well, and somewhat simplified.

There's also commentary about how it might be better if invalid names
due to filesystem corruption don't cause an immediate failure, but only
an error at the end of the readdir(), so that people can still see the
filenames that are ok.

There's also been discussion about just how much POSIX strictly speaking
requires this since it's about filesystem corruption.  It's really more
"protect user space from bad behavior" as pointed out by Jann.  But
since Eric Biederman looked up the POSIX wording, here it is for context:

"From readdir:

   The readdir() function shall return a pointer to a structure
   representing the directory entry at the current position in the
   directory stream specified by the argument dirp, and position the
   directory stream at the next entry. It shall return a null pointer
   upon reaching the end of the directory stream. The structure dirent
   defined in the <dirent.h> header describes a directory entry.

  From definitions:

   3.129 Directory Entry (or Link)

   An object that associates a filename with a file. Several directory
   entries can associate names with the same file.

  ...

   3.169 Filename

   A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
   characters composing the name may be selected from the set of all
   character values excluding the slash character and the null byte. The
   filenames dot and dot-dot have special meaning. A filename is
   sometimes referred to as a 'pathname component'."

Note that I didn't bother adding the checks to any legacy interfaces
that nobody uses.

Also note that if this ends up being noticeable as a performance
regression, we can fix that to do a much more optimized model that
checks for both NUL and '/' at the same time one word at a time.

We haven't really tended to optimize 'memchr()', and it only checks for
one pattern at a time anyway, and we really _should_ check for NUL too
(but see the comment about "soft errors" in the code about why it
currently only checks for '/')

See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
lookup code looks for pathname terminating characters in parallel.

Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Convert filldir[64]() from __put_user() to unsafe_put_user()

We really should avoid the "__{get,put}_user()" functions entirely,
because they can easily be mis-used and the original intent of being
used for simple direct user accesses no longer holds in a post-SMAP/PAN
world.

Manually optimizing away the user access range check makes no sense any
more, when the range check is generally much cheaper than the "enable
user accesses" code that the __{get,put}_user() functions still need.

So instead of __put_user(), use the unsafe_put_user() interface with
user_access_{begin,end}() that really does generate better code these
days, and which is generally a nicer interface.  Under some loads, the
multiple user writes that filldir() does are actually quite noticeable.

This also makes the dirent name copy use unsafe_put_user() with a couple
of macros.  We do not want to make function calls with SMAP/PAN
disabled, and the code this generates is quite good when the
architecture uses "asm goto" for unsafe_put_user() like x86 does.

Note that this doesn't bother with the legacy cases.  Nobody should use
them anyway, so performance doesn't really matter there.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git./linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold.

2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric
    Dumazet.

3) txq null deref in mac80211, from Miaoqing Pan.

4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann.

5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso.

6) Avoid division by zero in taprio scheduler, from Vladimir Oltean.

7) Various xgmac fixes in stmmac driver from Jose Abreu.

8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from
    Michal Kubecek.

9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij.

10) Fix sleep while atomic in sja1105, from Vladimir Oltean.

11) Suspend/resume deadlock in stmmac, from Thierry Reding.

12) Various UDP GSO fixes from Josh Hunt.

13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric
    Dumazet.

14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern.

15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
  selftests/net: add nettest to .gitignore
  net: qlogic: Fix memory leak in ql_alloc_large_buffers
  nfc: fix memory leak in llcp_sock_bind()
  sch_dsmark: fix potential NULL deref in dsmark_init()
  net: phy: at803x: use operating parameters from PHY-specific status
  net: phy: extract pause mode
  net: phy: extract link partner advertisement reading
  net: phy: fix write to mii-ctrl1000 register
  ipv6: Handle missing host route in __ipv6_ifa_notify
  net: phy: allow for reset line to be tied to a sleepy GPIO controller
  net: ipv4: avoid mixed n_redirects and rate_tokens usage
  r8152: Set macpassthru in reset_resume callback
  cxgb4:Fix out-of-bounds MSI-X info array access
  Revert "ipv6: Handle race in addrconf_dad_work"
  net: make sock_prot_memory_pressure() return "const char *"
  rxrpc: Fix rxrpc_recvmsg tracepoint
  qmi_wwan: add support for Cinterion CLS8 devices
  tcp: fix slab-out-of-bounds in tcp_zerocopy_receive()
  lib: textsearch: fix escapes in example code
  udp: only do GSO if # of segs > 1
  ...

Merge tag 's390-5.4-3' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

- defconfig updates

- Fix build errors with CC_OPTIMIZE_FOR_SIZE due to usage of "i"
   constraint for function arguments. Two kvm changes acked-by Christian
   Borntraeger.

- Fix -Wunused-but-set-variable warnings in mm code.

- Avoid a constant misuse in qdio.

- Handle a case when cpumf is temporarily unavailable.

* tag 's390-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  KVM: s390: mark __insn32_query() as __always_inline
  KVM: s390: fix __insn32_query() inline assembly
  s390: update defconfigs
  s390/pci: mark function(s) __always_inline
  s390/mm: mark function(s) __always_inline
  s390/jump_label: mark function(s) __always_inline
  s390/cpu_mf: mark function(s) __always_inline
  s390/atomic,bitops: mark function(s) __always_inline
  s390/mm: fix -Wunused-but-set-variable warnings
  s390: mark __cpacf_query() as __always_inline
  s390/qdio: clarify size of the QIB parm area
  s390/cpumf: Fix indentation in sampling device driver
  s390/cpumsf: Check for CPU Measurement sampling
  s390/cpumf: Use consistant debug print format

KVM: s390: mark __insn32_query() as __always_inline

__insn32_query() will not compile if the compiler decides to not
inline it, since it contains an inline assembly with an "i" constraint
with variable contents.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

KVM: s390: fix __insn32_query() inline assembly

The inline assembly constraints of __insn32_query() tell the compiler
that only the first byte of "query" is being written to. Intended was
probably that 32 bytes are written to.

Fix and simplify the code and just use a "memory" clobber.

Fixes: d668139718a9 ("KVM: s390: provide query function for instructions returning 32 byte")
Cc: stable@vger.kernel.org # v5.2+
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

kheaders: make headers archive reproducible

In commit 43d8ce9d65a5 ("Provide in-kernel headers to make
extending kernel easier") a new mechanism was introduced, for kernels
>=5.2, which embeds the kernel headers in the kernel image or a module
and exposes them in procfs for use by userland tools.

The archive containing the header files has nondeterminism caused by
header files metadata. This patch normalizes the metadata and utilizes
KBUILD_BUILD_TIMESTAMP if provided and otherwise falls back to the
default behaviour.

In commit f7b101d33046 ("kheaders: Move from proc to sysfs") it was
modified to use sysfs and the script for generation of the archive was
renamed to what is being patched.

Signed-off-by: Dmitry Goldin <dgoldin+lkml@protonmail.ch>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

kbuild: update compile-test header list for v5.4-rc2

Commit 6dc280ebeed2 ("coda: remove uapi/linux/coda_psdev.h") removed
a header in question. Some more build errors were fixed. Add more
headers into the test coverage.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

kbuild: two minor updates for Documentation/kbuild/modules.rst

Capitalize the first word in the sentence.

Use obj-m instead of obj-y. obj-y still works, but we have no built-in
objects in external module builds. So, obj-m is better IMHO.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

scripts/setlocalversion: clear local variable to make it work for sh

Geert Uytterhoeven reports a strange side-effect of commit 858805b336be
("kbuild: add $(BASH) to run scripts with bash-extension"), which
inserts the contents of a localversion file in the build directory twice.

[Steps to Reproduce]
  $ echo bar > localversion
  $ mkdir build
  $ cd build/
  $ echo foo > localversion
  $ make -s -f ../Makefile defconfig include/config/kernel.release
  $ cat include/config/kernel.release
  5.4.0-rc1foofoobar

This comes down to the behavior change of local variables.

The 'man sh' on my Ubuntu machine, where sh is an alias to dash,
explains as follows:
  When a variable is made local, it inherits the initial value and
  exported and readonly flags from the variable with the same name
  in the surrounding scope, if there is one. Otherwise, the variable
  is initially unset.

[Test Code]

  foo ()
  {
          local res
          echo "res: $res"
  }

  res=1
  foo

[Result]

  $ sh test.sh
  res: 1
  $ bash test.sh
  res:

So, scripts/setlocalversion correctly works only for bash in spite of
its hashbang being #!/bin/sh. Nobody had noticed it before because
CONFIG_SHELL was previously set to bash almost all the time.

Now that CONFIG_SHELL is set to sh, we must write portable and correct
code. I gave the Fixes tag to the commit that uncovered the issue.

Clear the variable 'res' in collect_files() to make it work for sh
(and it also works on distributions where sh is an alias to bash).

Fixes: 858805b336be ("kbuild: add $(BASH) to run scripts with bash-extension")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

namespace: fix namespace.pl script to support relative paths

The namespace.pl script does not work properly if objtree is not set to
an absolute path. The do_nm function is run from within the find
function, which changes directories.

Because of this, appending objtree, $File::Find::dir, and $source, will
return a path which is not valid from the current directory.

This used to work when objtree was set to an absolute path when using
"make namespacecheck". It appears to have not worked when calling
./scripts/namespace.pl directly.

This behavior was changed in 7e1c04779efd ("kbuild: Use relative path
for $(objtree)", 2014-05-14)

Rather than fixing the Makefile to set objtree to an absolute path, just
fix namespace.pl to work when srctree and objtree are relative. Also fix
the script to use an absolute path for these by default.

Use the File::Spec module for this purpose. It's been part of perl
5 since 5.005.

The curdir() function is used to get the current directory when the
objtree and srctree aren't set in the environment.

rel2abs() is used to convert possibly relative objtree and srctree
environment variables to absolute paths.

Finally, the catfile() function is used instead of string appending
paths together, since this is more robust when joining paths together.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

video/logo: do not generate unneeded logo C files

Currently, all the logo C files are generated irrespective of the
CONFIG options. Adding them to extra-y is wrong. What we need to do
here is to add them to 'targets' so that if_changed works properly.

Files listed in 'targets' are cleaned, so clean-files is unneeded.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

video/logo: remove unneeded *.o pattern from clean-files

The pattern *.o is cleaned up globally by the top Makefile.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

integrity: remove pointless subdir-$(CONFIG_...)

The ima/ and evm/ sub-directories contain built-in objects, so
obj-$(CONFIG_...) is the correct way to descend into them.

subdir-$(CONFIG_...) is redundant.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

integrity: remove unneeded, broken attempt to add -fshort-wchar

I guess commit 15ea0e1e3e18 ("efi: Import certificates from UEFI Secure
Boot") attempted to add -fshort-wchar for building load_uefi.o, but it
has never worked as intended.

load_uefi.o is created in the platform_certs/ sub-directory. If you
really want to add -fshort-wchar, the correct code is:

$(obj)/platform_certs/load_uefi.o: KBUILD_CFLAGS += -fshort-wchar

But, you do not need to fix it.

Commit 8c97023cf051 ("Kbuild: use -fshort-wchar globally") had already
added -fshort-wchar globally. This code was unneeded in the first place.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

selftests/net: add nettest to .gitignore

nettest is missing from gitignore.

Fixes: acda655fefae ("selftests: Add nettest")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: qlogic: Fix memory leak in ql_alloc_large_buffers

In ql_alloc_large_buffers, a new skb is allocated via netdev_alloc_skb.
This skb should be released if pci_dma_mapping_error fails.

Fixes: 0f8ab89e825f ("qla3xxx: Check return code from pci_map_single() in ql_release_to_lrg_buf_free_list(), ql_populate_free_queue(), ql_alloc_large_buffers(), and ql3xxx_send()")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: fix memory leak in llcp_sock_bind()

sysbot reported a memory leak after a bind() has failed.

While we are at it, abort the operation if kmemdup() has failed.

BUG: memory leak
unreferenced object 0xffff888105d83ec0 (size 32):
  comm "syz-executor067", pid 7207, jiffies 4294956228 (age 19.430s)
  hex dump (first 32 bytes):
    00 69 6c 65 20 72 65 61 64 00 6e 65 74 3a 5b 34  .ile read.net:[4
    30 32 36 35 33 33 30 39 37 5d 00 00 00 00 00 00  026533097]......
  backtrace:
    [<0000000036bac473>] kmemleak_alloc_recursive /./include/linux/kmemleak.h:43 [inline]
    [<0000000036bac473>] slab_post_alloc_hook /mm/slab.h:522 [inline]
    [<0000000036bac473>] slab_alloc /mm/slab.c:3319 [inline]
    [<0000000036bac473>] __do_kmalloc /mm/slab.c:3653 [inline]
    [<0000000036bac473>] __kmalloc_track_caller+0x169/0x2d0 /mm/slab.c:3670
    [<000000000cd39d07>] kmemdup+0x27/0x60 /mm/util.c:120
    [<000000008e57e5fc>] kmemdup /./include/linux/string.h:432 [inline]
    [<000000008e57e5fc>] llcp_sock_bind+0x1b3/0x230 /net/nfc/llcp_sock.c:107
    [<000000009cb0b5d3>] __sys_bind+0x11c/0x140 /net/socket.c:1647
    [<00000000492c3bbc>] __do_sys_bind /net/socket.c:1658 [inline]
    [<00000000492c3bbc>] __se_sys_bind /net/socket.c:1656 [inline]
    [<00000000492c3bbc>] __x64_sys_bind+0x1e/0x30 /net/socket.c:1656
    [<0000000008704b2a>] do_syscall_64+0x76/0x1a0 /arch/x86/entry/common.c:296
    [<000000009f4c57a4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 30cc4587659e ("NFC: Move LLCP code to the NFC top level diirectory")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_dsmark: fix potential NULL deref in dsmark_init()

Make sure TCA_DSMARK_INDICES was provided by the user.

syzbot reported :

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 8799 Comm: syz-executor235 Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:nla_get_u16 include/net/netlink.h:1501 [inline]
RIP: 0010:dsmark_init net/sched/sch_dsmark.c:364 [inline]
RIP: 0010:dsmark_init+0x193/0x640 net/sched/sch_dsmark.c:339
Code: 85 db 58 0f 88 7d 03 00 00 e8 e9 1a ac fb 48 8b 9d 70 ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 ca
RSP: 0018:ffff88809426f3b8 EFLAGS: 00010247
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff85c6eb09
RDX: 0000000000000000 RSI: ffffffff85c6eb17 RDI: 0000000000000004
RBP: ffff88809426f4b0 R08: ffff88808c4085c0 R09: ffffed1015d26159
R10: ffffed1015d26158 R11: ffff8880ae930ac7 R12: ffff8880a7e96940
R13: dffffc0000000000 R14: ffff88809426f8c0 R15: 0000000000000000
FS: 0000000001292880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000080 CR3: 000000008ca1b000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
qdisc_create+0x4ee/0x1210 net/sched/sch_api.c:1237
tc_modify_qdisc+0x524/0x1c50 net/sched/sch_api.c:1653
rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5223
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:657
___sys_sendmsg+0x803/0x920 net/socket.c:2311
__sys_sendmsg+0x105/0x1d0 net/socket.c:2356
__do_sys_sendmsg net/socket.c:2365 [inline]
__se_sys_sendmsg net/socket.c:2363 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440369

Fixes: 758cc43c6d73 ("[PKT_SCHED]: Fix dsmark to apply changes consistent")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'Fix-regression-with-AR8035-speed-downgrade'

Russell King says:

====================
Fix regression with AR8035 speed downgrade

The following series attempts to address an issue spotted by tinywrkb
with the AR8035 on the Cubox-i2 in a situation where the PHY downgrades
the negotiated link.

This is version 2, not much has changed other than rebasing on the
current net tree.  Changes have happend to patch 2 due to conflicts,
so I dropped Andrew's reviewed-by.  Minor context changes to patch 4
which I don't consider important enough to warrant dropping the
reviewed-by.

Before commit 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in
genphy_read_status"), we would read not only the link partner's
advertisement, but also our own advertisement from the PHY registers,
and use both to derive the PHYs current link mode.  This works when the
AR8035 downgrades the speed, because it appears that the AR8035 clears
link mode bits in the advertisement registers as part of the downgrade.

Commentary: what is not yet known is whether the AR8035 restores the
            advertisement register when the link goes down to the
    previous state.

However, since the above referenced commit, we no longer use the PHYs
advertisement registers, instead converting the link partner's
advertisement to the ethtool link mode array, and combine that with
phylib's cached version of our advertisement - which is not updated on
speed downgrade.

This results in phylib disagreeing with the actual operating mode of
the PHY.

Commentary: I wonder how many more PHY drivers are broken by this
    commit, but have yet to be discovered.

The obvious way to address this would be to disable the downgrade
feature, and indeed this does fix the problem in tinywrkb's case - his
link partner instead downgrades the speed by reducing its
advertisement, resulting in phylib correctly evaluating a slower speed.

However, it has a serious drawback - the gigabit control register (MII
register 9) appears to become read only.  It seems the only way to
update the register is to re-enable the downgrade feature, reset the
PHY, changing register 9, disable the downgrade feature, and reset the
PHY again.

This series attempts to address the problem using a different approach,
similar to the approach taken with Marvell PHYs.  The AR8031, AR8033
and AR8035 have a PHY-Specific Status register which reports the
actual operating mode of the PHY - both speed and duplex.  This
register correctly reports the operating mode irrespective of whether
autoneg is enabled or not.  We use this register to fill in phylib's
speed and duplex parameters.

In detail:

Patch 1 fixes a bug where writing to register 9 does not update
phylib's advertisement mask in the same way that writing register 4
does; this looks like an omission from when gigabit PHY support came
into being.

Patch 2 seperates the generic phylib code which reads the link partners
advertisement from the PHY, so that we can re-use this in the Atheros
PHY driver.

Patch 3 seperates the generic phylib pause mode; phylib provides no
help for MAC drivers to ascertain the negotiated pause mode, it merely
copies the link partner's pause mode bits into its own variables.

Commentary: Both the aforementioned Atheros PHYs and Marvell PHYs
            provide the resolved pause modes in terms of whether
    we should transmit pause frames, or whether we should
    allow reception of pause frames.  Surely the resolution
    of this should be in phylib?

Patch 4 provides the Atheros PHY driver with a private "read_status"
implementation that fills in phylib's speed and duplex settings
depending on the PHY-Specific status register.  This ensures that
phylib and the MAC driver match the operating mode that the PHY has
decided to use.  Since the register also gives us MDIX status, we
can trivially fill that status in as well.

Note that, although the bits mentioned in this patch for this register
match those in th Marvell PHY driver, and it is located at the same
address, the meaning of other register bits varies between the PHYs.
Therefore, I do not feel that it would be appropriate to make this some
kind of generic function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: at803x: use operating parameters from PHY-specific status

Read the PHY-specific status register for the current operating mode
(speed and duplex) of the PHY. This register reflects the actual
mode that the PHY has resolved depending on either the advertisements
of autoneg is enabled, or the forced mode if autoneg is disabled.

This ensures that phylib's software state always tracks the hardware
state.

It seems both AR8033 (which uses the AR8031 ID) and AR8035 support
this status register. AR8030 is not known at the present time.

This patch depends on "net: phy: extract pause mode" and "net: phy:
extract link partner advertisement reading".

Reported-by: tinywrkb <tinywrkb@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: tinywrkb <tinywrkb@gmail.com>
Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: extract pause mode

Extract the update of phylib's software pause mode state from
genphy_read_status(), so that we can re-use this functionality with
PHYs that have alternative ways to read the negotiation results.

Tested-by: tinywrkb <tinywrkb@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: extract link partner advertisement reading

Move reading the link partner advertisement out of genphy_read_status()
into its own separate function. This will allow re-use of this code by
PHY drivers that are able to read the resolved status from the PHY.

Tested-by: tinywrkb <tinywrkb@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: fix write to mii-ctrl1000 register

When userspace writes to the MII_ADVERTISE register, we update phylib's
advertising mask and trigger a renegotiation. However, writing to the
MII_CTRL1000 register, which contains the gigabit advertisement, does
neither. This can lead to phylib's copy of the advertisement becoming
de-synced with the values in the PHY register set, which can result in
incorrect negotiation resolution.

Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Handle missing host route in __ipv6_ifa_notify

Rajendra reported a kernel panic when a link was taken down:

    [ 6870.263084] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
    [ 6870.271856] IP: [<ffffffff8efc5764>] __ipv6_ifa_notify+0x154/0x290

    <snip>

    [ 6870.570501] Call Trace:
    [ 6870.573238] [<ffffffff8efc58c6>] ? ipv6_ifa_notify+0x26/0x40
    [ 6870.579665] [<ffffffff8efc98ec>] ? addrconf_dad_completed+0x4c/0x2c0
    [ 6870.586869] [<ffffffff8efe70c6>] ? ipv6_dev_mc_inc+0x196/0x260
    [ 6870.593491] [<ffffffff8efc9c6a>] ? addrconf_dad_work+0x10a/0x430
    [ 6870.600305] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
    [ 6870.606732] [<ffffffff8ea93a7a>] ? process_one_work+0x18a/0x430
    [ 6870.613449] [<ffffffff8ea93d6d>] ? worker_thread+0x4d/0x490
    [ 6870.619778] [<ffffffff8ea93d20>] ? process_one_work+0x430/0x430
    [ 6870.626495] [<ffffffff8ea99dd9>] ? kthread+0xd9/0xf0
    [ 6870.632145] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
    [ 6870.638573] [<ffffffff8ea99d00>] ? kthread_park+0x60/0x60
    [ 6870.644707] [<ffffffff8f01ae77>] ? ret_from_fork+0x57/0x70
    [ 6870.650936] Code: 31 c0 31 d2 41 b9 20 00 08 02 b9 09 00 00 0

addrconf_dad_work is kicked to be scheduled when a device is brought
up. There is a race between addrcond_dad_work getting scheduled and
taking the rtnl lock and a process taking the link down (under rtnl).
The latter removes the host route from the inet6_addr as part of
addrconf_ifdown which is run for NETDEV_DOWN. The former attempts
to use the host route in __ipv6_ifa_notify. If the down event removes
the host route due to the race to the rtnl, then the BUG listed above
occurs.

Since the DAD sequence can not be aborted, add a check for the missing
host route in __ipv6_ifa_notify. The only way this should happen is due
to the previously mentioned race. The host route is created when the
address is added to an interface; it is only removed on a down event
where the address is kept. Add a warning if the host route is missing
AND the device is up; this is a situation that should never happen.

Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Reported-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: allow for reset line to be tied to a sleepy GPIO controller

mdio_device_reset() makes use of the atomic-pretending API flavor for
handling the PHY reset GPIO line.

I found no hint that mdio_device_reset() is called from atomic context
and indeed it uses usleep_range() since long time, so I would assume that
it is OK to sleep there.

This patch switch to gpiod_set_value_cansleep() in mdio_device_reset().
This is relevant if e.g. the PHY reset line is tied to a I2C GPIO
controller.

This has been tested on a ZynqMP board running an upstream 4.19 kernel and
then hand-ported on current kernel tree.

Signed-off-by: Andrea Merello <andrea.merello@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv4: avoid mixed n_redirects and rate_tokens usage

Since commit c09551c6ff7f ("net: ipv4: use a dedicated counter
for icmp_v4 redirect packets") we use 'n_redirects' to account
for redirect packets, but we still use 'rate_tokens' to compute
the redirect packets exponential backoff.

If the device sent to the relevant peer any ICMP error packet
after sending a redirect, it will also update 'rate_token' according
to the leaking bucket schema; typically 'rate_token' will raise
above BITS_PER_LONG and the redirect packets backoff algorithm
will produce undefined behavior.

Fix the issue using 'n_redirects' to compute the exponential backoff
in ip_rt_send_redirect().

Note that we still clear rate_tokens after a redirect silence period,
to avoid changing an established behaviour.

The root cause predates git history; before the mentioned commit in
the critical scenario, the kernel stopped sending redirects, after
the mentioned commit the behavior more randomic.

Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: c09551c6ff7f ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: Set macpassthru in reset_resume callback

r8152 may fail to establish network connection after resume from system
suspend.

If the USB port connects to r8152 lost its power during system suspend,
the MAC address was written before is lost. The reason is that The MAC
address doesn't get written again in its reset_resume callback.

So let's set MAC address again in reset_resume callback. Also remove
unnecessary lock as no other locking attempt will happen during
reset_resume.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Make function sja1105_xfer_long_buf static

Fix sparse warnings:

drivers/net/dsa/sja1105/sja1105_spi.c:159:5: warning: symbol 'sja1105_xfer_long_buf' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: devlink: don't ignore errors during dumpit

Currently, some dumpit function may end-up with error which is not
-EMSGSIZE and this error is silently ignored. Use does not have clue
that something wrong happened. Instead of silent ignore, propagate
the error to user.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: Add support for port mirroring

Amazingly, of all features, this does not require a switch reset.

Tested with:

tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress matchall skip_sw \
action mirred egress mirror dev swp3
tc filter show dev swp2 ingress
tc filter del dev swp2 ingress pref 49152

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4:Fix out-of-bounds MSI-X info array access

When fetching free MSI-X vectors for ULDs, check for the error code
before accessing MSI-X info array. Otherwise, an out-of-bounds access is
attempted, which results in kernel panic.

Fixes: 94cdb8bb993a ("cxgb4: Add support for dynamic allocation of resources for ULD")
Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "ipv6: Handle race in addrconf_dad_work"

This reverts commit a3ce2a21bb8969ae27917281244fa91bf5f286d7.

Eric reported tests failings with commit. After digging into it,
the bottom line is that the DAD sequence is not to be messed with.
There are too many cases that are expected to proceed regardless
of whether a device is up.

Revert the patch and I will send a different solution for the
problem Rajendra reported.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: make sock_prot_memory_pressure() return "const char *"

This function returns string literals which are "const char *".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igmp: uninline ip_mc_validate_checksum()

This function is only used via function pointer.

"inline" doesn't hurt given that taking address of an inline function
forces out-of-line version but it doesn't help either.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: spread "enum sock_flags"

Some ints are "enum sock_flags" in fact.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net, uapi: fix -Wpointer-arith warnings

Add casts to fix these warnings:

./usr/include/linux/netfilter_arp/arp_tables.h:200:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/netfilter_bridge/ebtables.h:197:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/netfilter_ipv4/ip_tables.h:223:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/netfilter_ipv6/ip6_tables.h:263:19: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/tipc_config.h:310:28: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/tipc_config.h:410:24: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]
./usr/include/linux/virtio_ring.h:170:16: error: pointer of type 'void *' used in arithmetic [-Werror=pointer-arith]

Those are theoretical probably but kernel doesn't control compiler flags
in userspace.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-phy-broadcom-RGMII-delays-fixes'

Florian Fainelli says:

====================
net: phy: broadcom: RGMII delays fixes

This patch series fixes the BCM54210E RGMII delay configuration which
could only have worked in a PHY_INTERFACE_MODE_RGMII configuration.
There is a forward declaration added such that the first patch can be
picked up for -stable and apply fine all the way back to when the bug
was introduced.

The second patch eliminates duplicated code that used a different kind
of logic and did not use existing constants defined.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: broadcom: Use bcm54xx_config_clock_delay() for BCM54612E

bcm54612e_config_init() duplicates what bcm54xx_config_clock_delay()
does with respect to configuring RGMII TX/RX delays appropriately.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: broadcom: Fix RGMII delays configuration for BCM54210E

Commit 0fc9ae107669 ("net: phy: broadcom: add support for
BCM54210E") added support for BCM54210E but also unconditionally cleared
the RXC to RXD skew and the TXD to TXC skew, thus only making
PHY_INTERFACE_MODE_RGMII a possible configuration. Use
bcm54xx_config_clock_delay() which correctly sets the registers
depending on the 4 possible PHY interface values that exist for RGMII.

Fixes: 0fc9ae107669 ("net: phy: broadcom: add support for BCM54210E")
Reported-by: Manasa Mudireddy <manasa.mudireddy@broadcom.com>
Reported-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-tls-separate-the-TLS-TOE-code-out'

Jakub Kicinski says:

====================
net/tls: separate the TLS TOE code out

We have 3 modes of operation of TLS - software, crypto offload
(Mellanox, Netronome) and TCP Offload Engine-based (Chelsio).
The last one takes over the socket, like any TOE would, and
is not really compatible with how we want to do things in the
networking stack.

Confusingly the name of the crypto-only offload mode is TLS_HW,
while TOE-offload related functions use tls_hw_ as their prefix.

Engineers looking to implement offload are also be faced with
TOE artefacts like struct tls_device (while, again,
CONFIG_TLS_DEVICE actually gates the non-TOE offload).

To improve the clarity of the offload code move the TOE code
into new files, and rename the functions and structures
appropriately.

Because TOE-offload takes over the socket, and makes no use of
the TLS infrastructure in the kernel, the rest of the code
(anything beyond the ULP setup handlers) do not have to worry
about the mode == TLS_HW_RECORD case.

The increase in code size is due to duplication of the full
license boilerplate. Unfortunately original author (Dave Watson)
seems unreachable :(
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: allow compiling TLS TOE out

TLS "record layer offload" requires TOE, and bypasses most of
the normal networking stack. It is also significantly less
maintained. Allow users to compile it out to avoid issues.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: rename tls_hw_* functions tls_toe_*

The tls_hw_* functions are quite confusingly named, since they
are related to the TOE-offload, not TLS_HW offload which doesn't
require TOE. Rename them.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: move TOE-related code to a separate file

Move tls_hw_* functions to a new, separate source file
to avoid confusion with normal, non-TOE offload.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: move tls_build_proto() on init path

Move tls_build_proto() so that TOE offload doesn't have to call it
mid way through its bypass enable path.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: rename tls_device to tls_toe_device

Rename struct tls_device to struct tls_toe_device to avoid
confusion with normal, non-TOE offload.

No functional changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: move TOE-related structures to a separate header

Move tls_device structure and register/unregister functions
to a new header to avoid confusion with normal, non-TOE offload.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Add missing "new peer" trace

There was supposed to be a trace indicating that a new peer had been
created. Add it.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Fix rxrpc_recvmsg tracepoint

Fix the rxrpc_recvmsg tracepoint to handle being called with a NULL call
parameter.

Fixes: a25e21f0bcd2 ("rxrpc, afs: Use debug_ids rather than pointers in traces")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qmi_wwan: add support for Cinterion CLS8 devices

Add support for Cinterion CLS8 devices.
Use QMI_QUIRK_SET_DTR as required for Qualcomm MDM9x07 chipsets.

T:  Bus=01 Lev=03 Prnt=05 Port=01 Cnt=02 Dev#= 25 Spd=480  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b0 Rev= 3.18
S:  Manufacturer=GEMALTO
S:  Product=USB Modem
C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E:  Ad=89(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

Signed-off-by: Reinhard Speyerer <rspmn@arcor.de>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mips_fixes_5.4_1' of git://git./linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:

- Build fixes for Cavium Octeon & PMC-Sierra MSP systems, as well as
   all pre-MIPSr6 configurations built with binutils < 2.25.

- Boot fixes for 64-bit Loongson systems & SGI IP28 systems.

- Wire up the new clone3 syscall.

- Clean ups for a few build-time warnings.

* tag 'mips_fixes_5.4_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: fw/arc: Remove unused addr variable
  MIPS: pmcs-msp71xx: Remove unused addr variable
  MIPS: pmcs-msp71xx: Add missing MAX_PROM_MEM definition
  mips: Loongson: Fix the link time qualifier of 'serial_exit()'
  MIPS: init: Prevent adding memory before PHYS_OFFSET
  MIPS: init: Fix reservation of memory between PHYS_OFFSET and mem start
  MIPS: VDSO: Fix build for binutils < 2.25
  MIPS: VDSO: Remove unused gettimeofday.c
  MIPS: Wire up clone3 syscall
  MIPS: octeon: Include required header; fix octeon ethernet build
  MIPS: cpu-bugs64: Mark inline functions as __always_inline
  MIPS: dts: ar9331: fix interrupt-controller size
  MIPS: Loongson64: Fix boot failure after dropping boot_mem_map

Merge tag 'riscv/for-v5.4-rc2' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:

- Ensure that exclusive-load reservations are terminated after system
   call or exception handling. This primarily affects QEMU, which does
   not expire load reservations.

- Fix an issue primarily affecting RV32 platforms that can cause the DT
   header to be corrupted, causing boot failures.

* tag 'riscv/for-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix memblock reservation for device tree blob
  RISC-V: Clear load reservations while restoring hart contexts

Merge tag 'devicetree-fixes-for-5.4' of git://git./linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:
"Fix several 'dt_binding_check' build failures"

* tag 'devicetree-fixes-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: phy: lantiq: Fix Property Name
  dt-bindings: iio: ad7192: Fix DTC warning in the example
  dt-bindings: iio: ad7192: Fix Regulator Properties
  dt-bindings: media: rc: Fix redundant string
  dt-bindings: dsp: Fix fsl,dsp example

MIPS: fw/arc: Remove unused addr variable

The addr variable in prom_free_prom_memory() has been unused since
commit 0df1007677d5 ("MIPS: fw: Record prom memory"), leading to a
compiler warning:

arch/mips/fw/arc/memory.c:163:16:
warning: unused variable 'addr' [-Wunused-variable]

Fix this by removing the unused variable.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 0df1007677d5 ("MIPS: fw: Record prom memory")
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-mips@vger.kernel.org

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"ARM and x86 bugfixes of all kinds.

  The most visible one is that migrating a nested hypervisor has always
  been busted on Broadwell and newer processors, and that has finally
  been fixed"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
  KVM: x86: omit "impossible" pmu MSRs from MSR list
  KVM: nVMX: Fix consistency check on injected exception error code
  KVM: x86: omit absent pmu MSRs from MSR list
  selftests: kvm: Fix libkvm build error
  kvm: vmx: Limit guest PMCs to those supported on the host
  kvm: x86, powerpc: do not allow clearing largepages debugfs entry
  KVM: selftests: x86: clarify what is reported on KVM_GET_MSRS failure
  KVM: VMX: Set VMENTER_L1D_FLUSH_NOT_REQUIRED if !X86_BUG_L1TF
  selftests: kvm: add test for dirty logging inside nested guests
  KVM: x86: fix nested guest live migration with PML
  KVM: x86: assign two bits to track SPTE kinds
  KVM: x86: Expose XSAVEERPTR to the guest
  kvm: x86: Enumerate support for CLZERO instruction
  kvm: x86: Use AMD CPUID semantics for AMD vCPUs
  kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH
  KVM: X86: Fix userspace set invalid CR4
  kvm: x86: Fix a spurious -E2BIG in __do_cpuid_func
  KVM: LAPIC: Loosen filter for adaptive tuning of lapic_timer_advance_ns
  KVM: arm/arm64: vgic: Use the appropriate TRACE_INCLUDE_PATH
  arm64: KVM: Kill hyp_alternate_select()
  ...

Merge tag 'for-linus-5.4-rc2-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes and cleanups from Juergen Gross:

- a fix in the Xen balloon driver avoiding hitting a BUG_ON() in some
   cases, plus a follow-on cleanup series for that driver

- a patch for introducing non-blocking EFI callbacks in Xen's EFI
   driver, plu a cleanup patch for Xen EFI handling merging the x86 and
   ARM arch specific initialization into the Xen EFI driver

- a fix of the Xen xenbus driver avoiding a self-deadlock when cleaning
   up after a user process has died

- a fix for Xen on ARM after removal of ZONE_DMA

- a cleanup patch for avoiding build warnings for Xen on ARM

* tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: fix self-deadlock after killing user process
  xen/efi: have a common runtime setup function
  arm: xen: mm: use __GPF_DMA32 for arm64
  xen/balloon: Clear PG_offline in balloon_retrieve()
  xen/balloon: Mark pages PG_offline in balloon_append()
  xen/balloon: Drop __balloon_append()
  xen/balloon: Set pages PageOffline() in balloon_add_region()
  ARM: xen: unexport HYPERVISOR_platform_op function
  xen/efi: Set nonblocking callbacks

Merge branch 'devlink-allow-devlink-instances-to-change-network-namespace'

Jiri Pirko says:

====================
devlink: allow devlink instances to change network namespace

Devlink from the beginning counts with network namespaces, but the
instances has been fixed to init_net.

Implement change of network namespace as part of "devlink reload"
procedure like this:

$ ip netns add testns1
$ devlink/devlink dev reload netdevsim/netdevsim10 netns testns1

This command reloads device "netdevsim10" into network
namespace "testns1".

Note that "devlink reload" reinstantiates driver objects, effectively it
reloads the driver instance, including possible hw reset etc. Newly
created netdevices respect the network namespace of the parent devlink
instance and according to that, they are created in target network
namespace.

Driver is able to refuse to be reloaded into different namespace. That
is the case of mlx4 right now.

FIB entries and rules are replayed during FIB notifier registration
which is triggered during reload (driver instance init). FIB notifier
is also registered to the target network namespace, that allows user
to use netdevsim devlink resources to setup per-namespace limits of FIB
entries and FIB rules. In fact, with multiple netdevsim instances
in each network namespace, user might setup different limits.
This maintains and extends current netdevsim resources behaviour.

Patch 1 prepares netdevsim code for the follow-up changes in the
patchset. It does not change the behaviour, only moves pet-init_netns
accounting to netdevsim instance, which is also in init_netns.

Patches 2-5 prepare the FIB notifier making it per-netns and to behave
correctly upon error conditions.

Patch 6 just exports a devlink_net helper so it can be used in drivers.

Patches 7-9 do preparations in mlxsw driver.

Patches 10-13 do preparations in netdevsim driver, namely patch 12
implements proper devlink reload where the driver instance objects are
actually re-created as they should be.

Patch 14 actually implements the possibility to reload into a different
network namespace.

Patch 15 adds needed selftests for devlink reload into namespace for
netdevsim driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: netdevsim: add tests for devlink reload with resources

Add couple of tests for devlink reload testing and also resource
limitations testing, along with devlink reload.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: devlink: allow to change namespaces during reload

All devlink instances are created in init_net and stay there for a
lifetime. Allow user to be able to move devlink instances into
namespaces during devlink reload operation. That ensures proper
re-instantiation of driver objects, including netdevices.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: take devlink net instead of init_net

Follow-up patch is going to allow to reload devlink instance into
different network namespace, so use devlink_net() helper instead
of init_net.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: register port netdevices into net of device

Register newly created port netdevice into net namespace
that the parent device belongs to.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: implement proper devlink reload

During devlink reload, all driver objects should be reinstantiated with
the exception of devlink instance and devlink resources and params.
Move existing devlink_resource_size_get() calls into fib_create() just
before fib notifier is registered. Also, make sure that extack is
propagated down to fib_notifier_register() call.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: add all ports in nsim_dev_create() and del them in destroy()

Currently the probe/remove function does this separately. Put the
addition an deletion of ports into nsim_dev_create() and
nsim_dev_destroy().

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: Propagate extack down to register_fib_notifier()

During the devlink reaload the extack is present, so propagate it all
the way down to register_fib_notifier() call in spectrum_router.c.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: Register port netdevices into net of core

When creating netdevices for ports, put them under network namespace
that the core/parent devlink belongs to.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Take devlink net instead of init_net

Follow-up patch is going to allow to reload devlink instance into
different network namespace, so use devlink_net() helper instead
of init_net.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: devlink: export devlink net getter

Allow drivers to get net struct for devlink instance.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fib_notifier: propagate extack down to the notifier block callback

Since errors are propagated all the way up to the caller, propagate
possible extack of the caller all the way down to the notifier block
callback.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Don't rely on missing extack to symbolize dump

Currently if info->extack is NULL, mlxsw assumes that the event came
down from dump. Originally, the dump did not propagate the return value
back to the original caller (fib_notifier_register()). However, that is
now happening. So benefit from this and push the error up if it happened.
Remove rule cases in work handlers that are now dead code.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fib_notifier: propagate possible error during fib notifier registration

Unlike events for registered notifier, during the registration, the
errors that happened for the block being registered are not propagated
up to the caller. Make sure the error is propagated for FIB rules and
entries.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fib_notifier: make FIB notifier per-netns

Currently all users of FIB notifier only cares about events in init_net.
Later in this patchset, users get interested in other namespaces too.
However, for every registered block user is interested only about one
namespace. Make the FIB notifier registration per-netns and avoid
unnecessary calls of notifier block for other namespaces.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdevsim: change fib accounting and limitations to be per-device

Currently, the accounting is done per-namespace. However, devlink
instance is always in init_net namespace for now, so only the accounting
related to init_net is used. Limitations set using devlink resources
are only considered for init_net. nsim_devlink_net() always
returns init_net always.

Make the accounting per-device. This brings no functional change.
Per-device accounting has the same values as per-net.
For a single netdevsim instance, the behaviour is exactly the same
as before. When multiple netdevsim instances are created, each
can have different limits.

This is in prepare to implement proper devlink netns support. After
that, the devlink instance which would exist in particular netns would
account and limit that netns.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'copy-struct-from-user-v5.4-rc2' of git://git./linux/kernel/git/brauner/linux

Pull copy_struct_from_user() helper from Christian Brauner:
"This contains the copy_struct_from_user() helper which got split out
  from the openat2() patchset. It is a generic interface designed to
  copy a struct from userspace.

  The helper will be especially useful for structs versioned by size of
  which we have quite a few. This allows for backwards compatibility,
  i.e. an extended struct can be passed to an older kernel, or a legacy
  struct can be passed to a newer kernel. For the first case (extended
  struct, older kernel) the new fields in an extended struct can be set
  to zero and the struct safely passed to an older kernel.

  The most obvious benefit is that this helper lets us get rid of
  duplicate code present in at least sched_setattr(), perf_event_open(),
  and clone3(). More importantly it will also help to ensure that users
  implementing versioning-by-size end up with the same core semantics.

  This point is especially crucial since we have at least one case where
  versioning-by-size is used but with slighly different semantics:
  sched_setattr(), perf_event_open(), and clone3() all do do similar
  checks to copy_struct_from_user() while rt_sigprocmask(2) always
  rejects differently-sized struct arguments.

  With this pull request we also switch over sched_setattr(),
  perf_event_open(), and clone3() to use the new helper"

* tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  usercopy: Add parentheses around assignment in test_copy_struct_from_user
  perf_event_open: switch to copy_struct_from_user()
  sched_setattr: switch to copy_struct_from_user()
  clone3: switch to copy_struct_from_user()
  lib: introduce copy_struct_from_user() helper

Merge tag 'for-linus-20191003' of git://git./linux/kernel/git/brauner/linux

Pull clone3/pidfd fixes from Christian Brauner:
"This contains a couple of fixes:

   - Fix pidfd selftest compilation (Shuah Kahn)

     Due to a false linking instruction in the Makefile compilation for
     the pidfd selftests would fail on some systems.

   - Fix compilation for glibc on RISC-V systems (Seth Forshee)

     In some scenarios linux/uapi/linux/sched.h is included where
     __ASSEMBLY__ is defined causing a build failure because struct
     clone_args was not guarded by an #ifndef __ASSEMBLY__.

   - Add missing clone3() and struct clone_args kernel-doc (Christian Brauner)

     clone3() and struct clone_args were missing kernel-docs. (The goal
     is to use kernel-doc for any function or type where it's worth it.)
     For struct clone_args this also contains a comment about the fact
     that it's versioned by size"

* tag 'for-linus-20191003' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  sched: add kernel-doc for struct clone_args
  fork: add kernel-doc for clone3
  selftests: pidfd: Fix undefined reference to pthread_create()
  sched: Add __ASSEMBLY__ guards around struct clone_args

Merge tag 'drm-fixes-2019-10-04' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Been offline for 3 days, got back and had some fixes queued up.

  Nothing too major, the i915 dp-mst fix is important, and amdgpu has a
  bulk move speedup fix and some regressions, but nothing too insane for
  an rc2 pull. The intel fixes are also 2 weeks worth, they missed the
  boat last week.

  core:
   - writeback fixes

  i915:
   - Fix DP-MST crtc_mask
   - Fix dsc dpp calculations
   - Fix g4x sprite scaling stride check with GTT remapping
   - Fix concurrence on cases where requests where getting retired at
     same time as resubmitted to HW
   - Fix gen9 display resolutions by setting the right max plane width
   - Fix GPU hang on preemption
   - Mark contents as dirty on a write fault. This was breaking cursor
     sprite with dumb buffers.

  komeda:
   - memory leak fix

  tilcdc:
   - include fix

  amdgpu:
   - Enable bulk moves
   - Power metrics fixes for Navi
   - Fix S4 regression
   - Add query for tcc disabled mask
   - Fix several leaks in error paths
   - randconfig fixes
   - clang fixes"

* tag 'drm-fixes-2019-10-04' of git://anongit.freedesktop.org/drm/drm: (21 commits)
  Revert "drm/i915: Fix DP-MST crtc_mask"
  drm/omap: fix max fclk divider for omap36xx
  drm/i915: Fix g4x sprite scaling stride check with GTT remapping
  drm/i915/dp: Fix dsc bpp calculations, v5.
  drm/amd/display: fix dcn21 Makefile for clang
  drm/amd/display: hide an unused variable
  drm/amdgpu: display_mode_vba_21: remove uint typedef
  drm/amdgpu: hide another #warning
  drm/amdgpu: make pmu support optional, again
  drm/amd/display: memory leak
  drm/amdgpu: fix multiple memory leaks in acp_hw_init
  drm/amdgpu: return tcc_disabled_mask to userspace
  drm/amdgpu: don't increment vram lost if we are in hibernation
  Revert "drm/amdgpu: disable stutter mode for renoir"
  drm/amd/powerplay: add sensor lock support for smu
  drm/amd/powerplay: change metrics update period from 1ms to 100ms
  drm/amdgpu: revert "disable bulk moves for now"
  drm/tilcdc: include linux/pinctrl/consumer.h again
  drm/komeda: prevent memory leak in komeda_wb_connector_add
  drm: Clear the fence pointer when writeback job signaled
  ...

Merge tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

- Mandate timespec64 for the io_uring timeout ABI (Arnd)

- Set of NVMe changes via Sagi:
     - controller removal race fix from Balbir
     - quirk additions from Gabriel and Jian-Hong
     - nvme-pci power state save fix from Mario
     - Add 64bit user commands (for 64bit registers) from Marta
     - nvme-rdma/nvme-tcp fixes from Max, Mark and Me
     - Minor cleanups and nits from James, Dan and John

- Two s390 dasd fixes (Jan, Stefan)

- Have loop change block size in DIO mode (Martijn)

- paride pg header ifdef guard (Masahiro)

- Two blk-mq queue scheduler tweaks, fixing an ordering issue on zoned
   devices and suboptimal performance on others (Ming)

* tag 'for-linus-2019-10-03' of git://git.kernel.dk/linux-block: (22 commits)
  block: sed-opal: fix sparse warning: convert __be64 data
  block: sed-opal: fix sparse warning: obsolete array init.
  block: pg: add header include guard
  Revert "s390/dasd: Add discard support for ESE volumes"
  s390/dasd: Fix error handling during online processing
  io_uring: use __kernel_timespec in timeout ABI
  loop: change queue block size to match when using DIO
  blk-mq: apply normal plugging for HDD
  blk-mq: honor IO scheduler for multiqueue devices
  nvme-rdma: fix possible use-after-free in connect timeout
  nvme: Move ctrl sqsize to generic space
  nvme: Add ctrl attributes for queue_count and sqsize
  nvme: allow 64-bit results in passthru commands
  nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T
  nvmet-tcp: remove superflous check on request sgl
  Added QUIRKs for ADATA XPG SX8200 Pro 512GB
  nvme-rdma: Fix max_hw_sectors calculation
  nvme: fix an error code in nvme_init_subsystem()
  nvme-pci: Save PCI state before putting drive into deepest state
  nvme-tcp: fix wrong stop condition in io_work
  ...

s390: update defconfigs

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/pci: mark function(s) __always_inline

Always inline asm inlines with variable operands for "i" constraints,
since they won't compile if the compiler would decide to not inline
them.

Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/mm: mark function(s) __always_inline

Always inline asm inlines with variable operands for "i" constraints,
since they won't compile if the compiler would decide to not inline
them.

Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/jump_label: mark function(s) __always_inline

Always inline asm inlines with variable operands for "i" constraints,
since they won't compile if the compiler would decide to not inline
them.

Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/cpu_mf: mark function(s) __always_inline

Always inline asm inlines with variable operands for "i" constraints,
since they won't compile if the compiler would decide to not inline
them.

Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/atomic,bitops: mark function(s) __always_inline

Always inline asm inlines with variable operands for "i" constraints,
since they won't compile if the compiler would decide to not inline
them.

Reported-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390/mm: fix -Wunused-but-set-variable warnings

Convert two functions to static inline to get ride of W=1 GCC warnings
like,

mm/gup.c: In function 'gup_pte_range':
mm/gup.c:1816:16: warning: variable 'ptem' set but not used
[-Wunused-but-set-variable]
  pte_t *ptep, *ptem;
                ^~~~

mm/mmap.c: In function 'acct_stack_growth':
mm/mmap.c:2322:16: warning: variable 'new_start' set but not used
[-Wunused-but-set-variable]
  unsigned long new_start;
                ^~~~~~~~~

Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/lkml/1570138596-11913-1-git-send-email-cai@lca.pw/
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

s390: mark __cpacf_query() as __always_inline

arch/s390/kvm/kvm-s390.c calls on several places __cpacf_query() directly,
which makes it impossible to meet the "i" constraint for the asm operands
(opcode in this case).

As we are now force-enabling CONFIG_OPTIMIZE_INLINING on all
architectures, this causes a build failure on s390:

   In file included from arch/s390/kvm/kvm-s390.c:44:
   ./arch/s390/include/asm/cpacf.h: In function '__cpacf_query':
   ./arch/s390/include/asm/cpacf.h:179:2: warning: asm operand 3 probably doesn't match constraints
     179 |  asm volatile(
         |  ^~~
   ./arch/s390/include/asm/cpacf.h:179:2: error: impossible constraint in 'asm'

Mark __cpacf_query() as __always_inline in order to fix that, analogically
how we fixes __cpacf_check_opcode(), cpacf_query_func() and scpacf_query()
already.

Reported-and-tested-by: Michal Kubecek <mkubecek@suse.cz>
Fixes: d83623c5eab2 ("s390: mark __cpacf_check_opcode() and cpacf_query_func() as __always_inline")
Fixes: e60fb8bf68d4 ("s390/cpacf: mark scpacf_query() as __always_inline")
Fixes: ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly")
Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Link: https://lore.kernel.org/lkml/nycvar.YFH.7.76.1910012203010.13160@cbobk.fhfr.pm
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

KVM: x86: omit "impossible" pmu MSRs from MSR list

INTEL_PMC_MAX_GENERIC is currently 32, which exceeds the 18
contiguous MSR indices reserved by Intel for event selectors.
Since some machines actually have MSRs past the reserved range,
filtering them against x86_pmu.num_counters_gp may have false
positives. Cut the list to 18 entries to avoid this.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jim Mattson <jamttson@google.com>
Fixes: e2ada66ec418 ("kvm: x86: Add Intel PMU MSRs to msrs_to_save[]", 2019-08-21)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>