OSDN Git Service

tomoyo/tomoyo-test1.git
5 years agoscsi: mpt3sas: Determine smp affinity on per HBA basis
Sreekanth Reddy [Mon, 24 Jun 2019 14:42:55 +0000 (10:42 -0400)]
scsi: mpt3sas: Determine smp affinity on per HBA basis

Even though 'smp_affinity_enable' module parameter is enabled, if the
number of online CPUs is bigger than the number of msix vectors enabled on
that HBA, then smp affinity settings should be disabled only for this HBA.

But currently the smp affinity setting is disabled globally and hence smp
affinity will be disabled for subsequent HBAs even though number of msix
vectors enabled for this HBA matches the number of online CPUs.

To fix this, define a per HBA variable smp_affinity_enable.  Initially this
variable is initialized with smp_affinity_enable module parameter value. If
this HBA has fewer msix vectors configured than the number of online
CPUs, then only this HBA's variable smp_affinity_enable is set to
zero.
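
A minimal userspace sketch of the per-HBA flag described above. The names 'hba_sketch' and 'hba_init_affinity' are hypothetical and are not the driver's identifiers; only the decision rule is taken from this commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-adapter state; not the driver's real struct. */
struct hba_sketch {
	unsigned int msix_vectors;
	bool smp_affinity_enable;	/* per-HBA copy of the module parameter */
};

/*
 * Start from the module parameter value, then clear the flag for this HBA
 * only when there are more online CPUs than enabled msix vectors.
 */
static void hba_init_affinity(struct hba_sketch *hba, bool module_param,
			      unsigned int msix_vectors,
			      unsigned int online_cpus)
{
	hba->msix_vectors = msix_vectors;
	hba->smp_affinity_enable = module_param;
	if (online_cpus > msix_vectors)
		hba->smp_affinity_enable = false;
}
```

With 16 online CPUs, an HBA with 8 vectors ends up with the flag cleared while a second HBA with 16 vectors keeps it set.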

Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Use configured PCIe link speed, not max
Sreekanth Reddy [Mon, 24 Jun 2019 14:42:54 +0000 (10:42 -0400)]
scsi: mpt3sas: Use configured PCIe link speed, not max

When enabling high iops queues, the driver should use the HBA's configured
PCIe link speed instead of looking for the maximum link speed.

I.e. enable high iops queues only if Aero/Sea HBA's configured PCIe link
speed is set to 16GT/s.
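
As a rough illustration, and assuming the configured speed is read from the Current Link Speed field of the PCIe Link Status register (an assumption, the commit message does not say which register is used), the check could look like the sketch below. The macro and function names are local to this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Status register layout; the names are local to this sketch. */
#define LNKSTA_SPEED_MASK	0x000f	/* Current Link Speed field, bits 3:0 */
#define LNKSTA_SPEED_16GT	0x0004	/* encoding for 16 GT/s */

/* True when the negotiated link speed in 'link_status' is 16 GT/s. */
static bool link_runs_at_16gts(uint16_t link_status)
{
	return (link_status & LNKSTA_SPEED_MASK) == LNKSTA_SPEED_16GT;
}
```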

Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Remove CPU arch check to determine perf_mode
Sreekanth Reddy [Mon, 24 Jun 2019 14:42:53 +0000 (10:42 -0400)]
scsi: mpt3sas: Remove CPU arch check to determine perf_mode

Currently default perf_mode is set to 'balanced' on Intel architecture
machines and on other machines default perf_mode is set to 'latency' mode.

This CPU architecture check is removed and the default perf_mode is
set to 'balanced' on all machines.

Users can choose the required performance mode using the perf_mode module
parameter.

Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufs: Documentation: Announce ufs-tool v1.0
Arthur Simchaev [Tue, 25 Jun 2019 12:36:00 +0000 (15:36 +0300)]
scsi: ufs: Documentation: Announce ufs-tool v1.0

The ufs-tool stable release v1.0 is available at:

https://github.com/westerndigitalcorporation/ufs-tool

Feedback and bug reports, as always, are welcomed.

Signed-off-by: Arthur Simchaev <Arthur.Simchaev@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: fix bnx2fc_cmd refcount imbalance in send_srr
Lin Yi [Tue, 25 Jun 2019 02:35:29 +0000 (10:35 +0800)]
scsi: bnx2fc: fix bnx2fc_cmd refcount imbalance in send_srr

If cb_arg alloc failed, we can't release the struct orig_io_req refcount
before we take its refcount. As Saurav said, move the srr_err label down
to avoid unnecessary refcount release and nullptr free.

Signed-off-by: Lin Yi <teroincn@163.com>
Acked-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: fix bnx2fc_cmd refcount imbalance in send_rec
Lin Yi [Tue, 25 Jun 2019 02:34:16 +0000 (10:34 +0800)]
scsi: bnx2fc: fix bnx2fc_cmd refcount imbalance in send_rec

If cb_arg alloc failed, we can't release the struct orig_io_req refcount
before we take its refcount. As Saurav said, move the rec_err label down
to avoid unnecessary refcount release and nullptr free.

Signed-off-by: Lin Yi <teroincn@163.com>
Acked-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: Update the driver version to 2.12.10
Saurav Kashyap [Mon, 24 Jun 2019 08:30:00 +0000 (01:30 -0700)]
scsi: bnx2fc: Update the driver version to 2.12.10

Update the driver version to 2.12.10.

Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: Limit the IO size according to the FW capability
Saurav Kashyap [Mon, 24 Jun 2019 08:29:59 +0000 (01:29 -0700)]
scsi: bnx2fc: Limit the IO size according to the FW capability

 - Reduce the sg_tablesize to 255.

 - Reduce the MAX BDs firmware can handle to 255.

 - Return IO to ML if the BD count exceeds 255 after split.

 - Correct the size of each BD split to 0xffff.
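
The limits listed above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the driver code: 'count_bds' and the macro names are hypothetical, and returning -1 stands in for handing the IO back to the mid layer.

```c
#include <stddef.h>

#define BD_SPLIT_SZ	0xffff	/* largest length carried by one BD */
#define MAX_BDS		255	/* most BDs the firmware can handle */

/*
 * Return the number of BDs needed for the given segment lengths, or -1 when
 * the request would need more than MAX_BDS after splitting.
 */
static int count_bds(const size_t *seg_len, size_t nsegs)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nsegs; i++) {
		/* Each segment is split into chunks of at most BD_SPLIT_SZ. */
		total += (seg_len[i] + BD_SPLIT_SZ - 1) / BD_SPLIT_SZ;
		if (total > MAX_BDS)
			return -1;
	}
	return (int)total;
}
```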

Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: Do not allow both a cleanup completion and abort completion for the...
Saurav Kashyap [Mon, 24 Jun 2019 08:29:58 +0000 (01:29 -0700)]
scsi: bnx2fc: Do not allow both a cleanup completion and abort completion for the same request

If firmware sends either a cleanup or an abort completion, it means the
other won't be sent. Clear the flags for the other as well.

Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: Separate out completion flags and variables for abort and cleanup
Saurav Kashyap [Mon, 24 Jun 2019 08:29:57 +0000 (01:29 -0700)]
scsi: bnx2fc: Separate out completion flags and variables for abort and cleanup

Separate out the abort and cleanup flags and completions, to get a better
understanding of what is being processed.

Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: Only put reference to io_req in bnx2fc_abts_cleanup if cleanup times out
Chad Dupuis [Mon, 24 Jun 2019 08:29:56 +0000 (01:29 -0700)]
scsi: bnx2fc: Only put reference to io_req in bnx2fc_abts_cleanup if cleanup times out

In certain tests where the SCSI error handler issues an abort that is
already outstanding, we will cleanup the command so that the SCSI error
handler can proceed.  In some of these cases we were seeing a command
mismatch:

 kernel: scsi host2: bnx2fc: xid:0x42b eh_abort - refcnt = 2
 kernel: bnx2fc: eh_abort: io_req (xid = 0x42b) already in abts processing
 kernel: scsi host2: bnx2fc: xid:0x42b Entered bnx2fc_initiate_cleanup
 kernel: scsi host2: bnx2fc: xid:0x42b CLEANUP io_req xid = 0x80b
 kernel: scsi host2: bnx2fc: xid:0x80b cq_compl- cleanup resp rcvd
 kernel: scsi host2: bnx2fc: xid:0x42b complete - rx_state = 9
 kernel: scsi host2: bnx2fc: xid:0x42b Entered process_cleanup_compl refcnt = 2, cmd_type = 1
 kernel: scsi host2: bnx2fc: xid:0x42b scsi_done. err_code = 0x7
 kernel: scsi host2: bnx2fc: xid:0x42b sc=ffff8807f93dfb80, result=0x7, retries=0, allowed=5
 kernel: ------------[ cut here ]------------
 kernel: WARNING: at /root/rpmbuild/BUILD/netxtreme2-7.14.43/obj/default/bnx2fc-2.12.1/driver/bnx2fc_io.c:1347 bnx2fc_eh_abort+0x56f/0x680 [bnx2fc]()
 kernel: xid=0x42b refcount=-1
 kernel: Modules linked in:
 kernel: nls_utf8 isofs sr_mod cdrom tcp_lp dm_round_robin xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge ebtable_filter ebtables fuse ip6table_filter ip6_tables iptable_filter bnx2fc(OE) cnic(OE) uio fcoe libfcoe 8021q libfc garp mrp scsi_transport_fc stp llc scsi_tgt vfat fat dm_service_time intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ses enclosure ipmi_ssif i2c_core hpilo hpwdt wmi sg ipmi_devintf pcspkr ipmi_si ipmi_msghandler shpchp acpi_power_meter dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs sd_mod crc_t10dif
 kernel: crct10dif_generic bnx2x(OE) crct10dif_pclmul crct10dif_common crc32c_intel mdio ptp pps_core libcrc32c smartpqi scsi_transport_sas fjes uas usb_storage dm_mirror dm_region_hash dm_log dm_mod
 kernel: CPU: 9 PID: 2012 Comm: scsi_eh_2 Tainted: G        W  OE  ------------   3.10.0-514.el7.x86_64 #1
 kernel: Hardware name: HPE Synergy 480 Gen10/Synergy 480 Gen10 Compute Module, BIOS I42 03/21/2018
 kernel: ffff8807f25a3d98 0000000015e7fa0c ffff8807f25a3d50 ffffffff81685eac
 kernel: ffff8807f25a3d88 ffffffff81085820 ffff8807f8e39000 ffff880801ff7468
 kernel: ffff880801ff7610 0000000000002002 ffff8807f8e39014 ffff8807f25a3df0
 kernel: Call Trace:
 kernel: [<ffffffff81685eac>] dump_stack+0x19/0x1b
 kernel: [<ffffffff81085820>] warn_slowpath_common+0x70/0xb0
 kernel: [<ffffffff810858bc>] warn_slowpath_fmt+0x5c/0x80
 kernel: [<ffffffff8168d842>] ? _raw_spin_lock_bh+0x12/0x50
 kernel: [<ffffffffa0549e6f>] bnx2fc_eh_abort+0x56f/0x680 [bnx2fc]
 kernel: [<ffffffff814570af>] scsi_error_handler+0x59f/0x8b0
 kernel: [<ffffffff81456b10>] ? scsi_eh_get_sense+0x250/0x250
 kernel: [<ffffffff810b052f>] kthread+0xcf/0xe0
 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
 kernel: [<ffffffff81696418>] ret_from_fork+0x58/0x90
 kernel: [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
 kernel: ---[ end trace 42deb88f2032b111 ]---

The reason that there was a mismatch is that the SCSI command is actually
returned from the cleanup handler.  In previous testing, the type of
cleanup notification we'd get from the CQE did not trigger the code that
returned the SCSI command.  To overcome the previous behavior we would put
a reference in bnx2fc_abts_cleanup() to account for the SCSI command.
However, in cases where the SCSI command is actually off, we end up with an
extra put.

The fix for this is to only do the extra put in bnx2fc_abts_cleanup() if
the completion for the cleanup times out.

Signed-off-by: Chad Dupuis <cdupuis@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: bnx2fc: Redo setting source FCoE MAC
Chad Dupuis [Mon, 24 Jun 2019 08:29:55 +0000 (01:29 -0700)]
scsi: bnx2fc: Redo setting source FCoE MAC

For bnx2fc, the source FCoE MAC is stored in the fcoe_port struct in the
data_src_mac field.  Currently this is set in fcoe_ctlr_recv_flogi which
ends up setting it by simply using fc_fcoe_set_mac() which only uses the
default FCF-MAC.  We still want to store the source FCoE MAC in
port->data_src_mac but we want to snoop the FLOGI response payload so as to
set it in the following method:

1. If a granted_mac is found, use that.

2. If no granted_mac is there but there is an FCF-MAP from the FCF then
   create the MAC from the FCF-MAP and the destination ID from the frame.

3. If there is no FCF-MAP then use the spec. default FCF-MAP and the
   destination ID from the frame.
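
The three rules above can be sketched as a small selection function. This is a userspace illustration, not the bnx2fc code: 'choose_src_mac' is a hypothetical name, passing NULL for "no granted_mac" and 0 for "no FCF-MAP" are conventions chosen for this example, and 0x0efc00 is assumed to be the spec default map value.

```c
#include <stdint.h>
#include <string.h>

#define DEFAULT_FC_MAP	0x0efc00u	/* assumed spec default (0E-FC-00) */

/*
 * granted_mac: MAC from the FLOGI response, or NULL if none was granted.
 * fc_map:      24-bit map advertised by the FCF, or 0 if none is known.
 * d_id:        24-bit destination ID from the frame.
 */
static void choose_src_mac(uint8_t out[6], const uint8_t *granted_mac,
			   uint32_t fc_map, uint32_t d_id)
{
	uint32_t map;

	if (granted_mac) {			/* rule 1 */
		memcpy(out, granted_mac, 6);
		return;
	}
	map = fc_map ? fc_map : DEFAULT_FC_MAP;	/* rule 2, else rule 3 */
	out[0] = (map >> 16) & 0xff;
	out[1] = (map >> 8) & 0xff;
	out[2] = map & 0xff;
	out[3] = (d_id >> 16) & 0xff;
	out[4] = (d_id >> 8) & 0xff;
	out[5] = d_id & 0xff;
}
```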

Signed-off-by: Chad Dupuis <cdupuis@marvell.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufshdc-pci: Add Intel PCI IDs for EHL
Adrian Hunter [Fri, 21 Jun 2019 12:19:42 +0000 (15:19 +0300)]
scsi: ufshdc-pci: Add Intel PCI IDs for EHL

Add more Intel PCI IDs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufs-bsg: complete ufs-bsg job only if no error
Bean Huo [Sun, 23 Jun 2019 17:38:56 +0000 (17:38 +0000)]
scsi: ufs-bsg: complete ufs-bsg job only if no error

If execution of a UPIU/DME request fails in the UFS device,
ufs_bsg_request() completes the failed bsg job by calling
bsg_job_done(). It also returns the error status to the blk-mq layer,
which then completes the request a second time. This causes the
following panic.

Call trace:
ll_sc___cmpxchg_case_acq_32+0x4/0x20
complete+0x28/0x70
blk_end_sync_rq+0x24/0x30
blk_mq_end_request+0xb8/0x118
bsg_job_put+0x4c/0x58
bsg_complete+0x20/0x30
blk_done_softirq+0xb4/0xe8
do_softirq+0x154/0x3f0
run_ksoftirqd+0x4c/0x68
smpboot_thread_fn+0x22c/0x268
kthread+0x130/0x138
ret_from_fork+0x10/0x1c
Code: f84107fe d65f03c0 d503201f f9800011 (885ffc10)
---[ end trace d92825bff6326e66 ]---
Kernel panic - not syncing: Fatal exception in interrupt

Fix this issue by completing the ufs-bsg job only if no error happened.

[mkp: commit description tweak]

Fixes: df032bf27a41 (scsi: ufs: Add a bsg endpoint that supports UPIUs)
Signed-off-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufs-bsg: fix typo in ufs_bsg_request
Bean Huo [Sun, 23 Jun 2019 17:38:39 +0000 (17:38 +0000)]
scsi: ufs-bsg: fix typo in ufs_bsg_request

Change dev_dbg to dev_err so that the error information is printed when
a DME command fails.

Signed-off-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: virtio_scsi: remove unused 'affinity_hint_set'
Dongli Zhang [Wed, 19 Jun 2019 07:52:19 +0000 (15:52 +0800)]
scsi: virtio_scsi: remove unused 'affinity_hint_set'

The 'affinity_hint_set' is not used any longer since commit
0d9f0a52c8b9 ("virtio_scsi: use virtio IRQ affinity").

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: use DEVICE_ATTR_{RO, RW}
Tomas Henzl [Fri, 14 Jun 2019 14:41:44 +0000 (16:41 +0200)]
scsi: mpt3sas: use DEVICE_ATTR_{RO, RW}

Use existing macros.  No functional change.

[mkp: typo]

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: make driver options visible in sys
Tomas Henzl [Fri, 14 Jun 2019 14:41:43 +0000 (16:41 +0200)]
scsi: mpt3sas: make driver options visible in sys

Support is easier with all driver parameters visible in sysfs.  Also I've
replaced a constant with an octal permission.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufs-qcom: Add support for platforms booting ACPI
Lee Jones [Mon, 17 Jun 2019 11:54:54 +0000 (12:54 +0100)]
scsi: ufs-qcom: Add support for platforms booting ACPI

New Qualcomm AArch64 based laptops are now available which use UFS as their
primary data storage medium.  These devices are supplied with ACPI support
out of the box.  This patch ensures the Qualcomm UFS driver will be bound
when the "QCOM24A5" H/W device is advertised as present.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: Use struct_size() helper
Gustavo A. R. Silva [Fri, 7 Jun 2019 18:40:53 +0000 (13:40 -0500)]
scsi: megaraid_sas: Use struct_size() helper

One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with
memory for some number of elements for that array. For example:

struct MR_PD_CFG_SEQ_NUM_SYNC {
...
        struct MR_PD_CFG_SEQ seq[1];
} __packed;

Make use of the struct_size() helper instead of an open-coded version in
order to avoid any potential type mistakes.

So, replace the following form:

sizeof(struct MR_PD_CFG_SEQ_NUM_SYNC) + (sizeof(struct MR_PD_CFG_SEQ) * (MAX_PHYSICAL_DEVICES - 1))

with:

struct_size(pd_sync, seq, MAX_PHYSICAL_DEVICES - 1)
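
To make the equivalence concrete, here is a userspace sketch with a simplified macro. The struct layouts and the name 'STRUCT_SIZE_SKETCH' are illustrative only. As far as I know, the kernel helper also guards against arithmetic overflow, which this simplified version does not attempt.

```c
#include <stddef.h>

/* Simplified stand-ins; the field layout is illustrative only. */
struct seq_entry_sketch {
	unsigned short seq_num;
	unsigned short dev_handle;
};

struct seq_sync_sketch {
	unsigned int count;
	unsigned int size;
	struct seq_entry_sketch seq[1];	/* trailing array, one element built in */
};

/* Header size plus n trailing elements; no overflow check in this sketch. */
#define STRUCT_SIZE_SKETCH(p, member, n) \
	(sizeof(*(p)) + (size_t)(n) * sizeof((p)->member[0]))

/* The open-coded form being replaced. */
static size_t open_coded_size(size_t extra)
{
	return sizeof(struct seq_sync_sketch) +
	       sizeof(struct seq_entry_sketch) * extra;
}

/* The helper-based form; 'p' is only ever used inside sizeof. */
static size_t helper_size(size_t extra)
{
	struct seq_sync_sketch *p = NULL;

	return STRUCT_SIZE_SKETCH(p, seq, extra);
}
```

Both functions return the same value for any element count; the helper form has the advantage that the element type is derived from the pointer rather than spelled out a second time.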

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mac_scsi: Treat Last Byte Sent time-out as failure
Finn Thain [Sun, 9 Jun 2019 01:19:11 +0000 (11:19 +1000)]
scsi: mac_scsi: Treat Last Byte Sent time-out as failure

A system bus error during a PDMA send operation can result in bytes being
lost. Theoretically that could cause the target to remain in DATA OUT phase
and the initiator (expecting a phase change) would time-out waiting for the
Last Byte Sent flag. Should that happen, fail the transfer so the core
driver will stop using PDMA with this target.

Cc: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mac_scsi: Enable PDMA on Mac IIfx
Finn Thain [Sun, 9 Jun 2019 01:19:11 +0000 (11:19 +1000)]
scsi: mac_scsi: Enable PDMA on Mac IIfx

Add support for Apple's custom "SCSI DMA" chip. This patch doesn't make use
of its DMA capability. Just the PDMA capability is sufficient to improve
sequential read throughput by a factor of 5.

Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: Joshua Thompson <funaho@jurai.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mac_scsi: Fix pseudo DMA implementation, take 2
Finn Thain [Sun, 9 Jun 2019 01:19:11 +0000 (11:19 +1000)]
scsi: mac_scsi: Fix pseudo DMA implementation, take 2

A system bus error during a PDMA transfer can mess up the calculation of
the transfer residual (the PDMA handshaking hardware lacks a byte
counter). This results in data corruption.

The algorithm in this patch anticipates a bus error by starting each
transfer with a MOVE.B instruction. If a bus error is caught the transfer
will be retried. If a bus error is caught later in the transfer (for a
MOVE.W instruction) the transfer gets failed and subsequent requests for
that target will use PIO instead of PDMA.

This avoids the "!REQ and !ACK" error so the severity level of that message
is reduced to KERN_DEBUG.

Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org # v4.14+
Fixes: 3a0f64bfa907 ("mac_scsi: Fix pseudo DMA implementation")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Reported-by: Chris Jones <chris@martin-jones.com>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mac_scsi: Increase PIO/PDMA transfer length threshold
Finn Thain [Sun, 9 Jun 2019 01:19:11 +0000 (11:19 +1000)]
scsi: mac_scsi: Increase PIO/PDMA transfer length threshold

Some targets introduce delays when handshaking the response to certain
commands. For example, a disk may send a 96-byte response to an INQUIRY
command (or a 24-byte response to a MODE SENSE command) too slowly.

Apparently the first 12 or 14 bytes are handshaked okay but then the system
bus error timeout is reached while transferring the next word.

Since the scsi bus phase hasn't changed, the driver then sets the target
borken flag to prevent further PDMA transfers. The driver also logs the
warning, "switching to slow handshake".

Raise the PDMA threshold to 512 bytes so that PIO transfers will be used
for these commands. This default is sufficiently low that PDMA will still
be used for READ and WRITE commands.

The existing threshold (16 bytes) was chosen more or less at random.
However, best performance requires the threshold to be as low as possible.
Those systems that don't need the PIO workaround at all may benefit from
mac_scsi.setup_use_pdma=1
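
A minimal sketch of the threshold decision, assuming that transfers shorter than the threshold use PIO and that a target whose 'borken' flag is set never uses PDMA. The function name 'use_pdma' and the exact boundary comparison are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether a transfer of 'len' bytes should use PDMA.
 * 'threshold' plays the role of the setup_use_pdma parameter and
 * 'target_borken' stands for the per-target flag set after a failed
 * PDMA transfer.
 */
static bool use_pdma(size_t len, size_t threshold, bool target_borken)
{
	if (target_borken)
		return false;
	return len >= threshold;
}
```

Under these assumptions a 96-byte INQUIRY response goes through PIO with the 512-byte default, and through PDMA when the threshold is set to 1.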

Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: stable@vger.kernel.org # v4.14+
Fixes: 3a0f64bfa907 ("mac_scsi: Fix pseudo DMA implementation")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: NCR5380: Handle PDMA failure reliably
Finn Thain [Sun, 9 Jun 2019 01:19:11 +0000 (11:19 +1000)]
scsi: NCR5380: Handle PDMA failure reliably

A PDMA error is handled in the core driver by setting the device's 'borken'
flag and aborting the command. Unfortunately, do_abort() is not
dependable. Perform a SCSI bus reset instead, to make sure that the command
fails and gets retried.

Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: stable@vger.kernel.org # v4.20+
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: NCR5380: Always re-enable reselection interrupt
Finn Thain [Sun, 9 Jun 2019 01:19:11 +0000 (11:19 +1000)]
scsi: NCR5380: Always re-enable reselection interrupt

The reselection interrupt gets disabled during selection and must be
re-enabled when hostdata->connected becomes NULL. If it isn't re-enabled a
disconnected command may time-out or the target may wedge the bus while
trying to reselect the host. This can happen after a command is aborted.

Fix this by enabling the reselection interrupt in NCR5380_main() after
calls to NCR5380_select() and NCR5380_information_transfer() return.

Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: stable@vger.kernel.org # v4.9+
Fixes: 8b00c3d5d40d ("ncr5380: Implement new eh_abort_handler")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoRevert "scsi: ncr5380: Increase register polling limit"
Finn Thain [Sun, 9 Jun 2019 01:19:11 +0000 (11:19 +1000)]
Revert "scsi: ncr5380: Increase register polling limit"

This reverts commit 4822827a69d7cd3bc5a07b7637484ebd2cf88db6.

The purpose of that commit was to suppress a timeout warning message which
appeared to be caused by target latency. But suppressing the warning is
undesirable as the warning may indicate a messed up transfer count.

Another problem with that commit is that 15 ms is too long to keep
interrupts disabled as interrupt latency can cause system clock drift and
other problems.

Cc: Michael Schmitz <schmitzmic@gmail.com>
Cc: stable@vger.kernel.org
Fixes: 4822827a69d7 ("scsi: ncr5380: Increase register polling limit")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Tested-by: Stan Johnson <userm57@yahoo.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: wd719x: Fix resets and aborts
Ondrej Zary [Mon, 17 Jun 2019 17:50:12 +0000 (19:50 +0200)]
scsi: wd719x: Fix resets and aborts

Host reset oopses because it calls wd719x_chip_init, which calls
request_firmware, under a spinlock. Stop the RISC first, then flush active
SCBs under a spinlock. Finally call wd719x_chip_init unlocked.

Also found and fixed more bugs during tests:

Affected active SCBs were not flushed during abort, bus and device
reset. This caused problems in a following host reset (hang or oops).

Device and bus reset failed under load because the result of the reset
command is WD719X_SUE_TERM or WD719X_SUE_RESET. Don't treat these codes as
error in wd719x_wait_done.

wd719x_direct_cmd for RESET/ABORT commands didn't work properly, causing
timeouts. Looks like it was caused by the WD719X_DISABLE_INT bit. Not
setting it for RESET/ABORT commands seems to fix the problem.  Also lower
the log level of the corresponding "direct command completed" message to
debug.

Unfortunately, my documentation is missing some pages, including page
67 (SPIDER67.gif) about resets :(

Reported-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Signed-off-by: Ondrej Zary <linux@zary.sk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: RDMA/srp: Fix a sleep-in-invalid-context bug
Bart Van Assche [Mon, 17 Jun 2019 15:18:20 +0000 (08:18 -0700)]
scsi: RDMA/srp: Fix a sleep-in-invalid-context bug

The previous patch guarantees that srp_queuecommand() does not get
invoked while reconnecting occurs. Hence remove the code from
srp_queuecommand() that prevents command queueing while reconnecting.
This patch prevents the following from appearing in the kernel log:

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 1, irqs_disabled(): 0, pid: 5600, name: scsi_eh_9
1 lock held by scsi_eh_9/5600:
 #0:  (rcu_read_lock){....}, at: [<00000000cbb798c7>] __blk_mq_run_hw_queue+0xf1/0x1e0
Preemption disabled at:
[<00000000139badf2>] __blk_mq_delay_run_hw_queue+0x78/0xf0
CPU: 9 PID: 5600 Comm: scsi_eh_9 Tainted: G        W        4.15.0-rc4-dbg+ #1
Hardware name: Dell Inc. PowerEdge R720/0VWT90, BIOS 2.5.4 01/22/2016
Call Trace:
 dump_stack+0x67/0x99
 ___might_sleep+0x16a/0x250 [ib_srp]
 __mutex_lock+0x46/0x9d0
 srp_queuecommand+0x356/0x420 [ib_srp]
 scsi_dispatch_cmd+0xf6/0x3f0
 scsi_queue_rq+0x4a8/0x5f0
 blk_mq_dispatch_rq_list+0x73/0x440
 blk_mq_sched_dispatch_requests+0x109/0x1a0
 __blk_mq_run_hw_queue+0x131/0x1e0
 __blk_mq_delay_run_hw_queue+0x9a/0xf0
 blk_mq_run_hw_queue+0xc0/0x1e0
 blk_mq_start_hw_queues+0x2c/0x40
 scsi_run_queue+0x18e/0x2d0
 scsi_run_host_queues+0x22/0x40
 scsi_error_handler+0x18d/0x5f0
 kthread+0x11c/0x140
 ret_from_fork+0x24/0x30

Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: Avoid that .queuecommand() gets called for a blocked SCSI device
Bart Van Assche [Mon, 17 Jun 2019 15:18:19 +0000 (08:18 -0700)]
scsi: Avoid that .queuecommand() gets called for a blocked SCSI device

Several SCSI transport and LLD drivers surround code that does not
tolerate concurrent calls of .queuecommand() with scsi_target_block() /
scsi_target_unblock(). These last two functions use
blk_mq_quiesce_queue() / blk_mq_unquiesce_queue() for scsi-mq request
queues to prevent concurrent .queuecommand() calls. However, that is
not sufficient to prevent .queuecommand() calls from scsi_send_eh_cmnd().
Hence surround the .queuecommand() call from the SCSI error handler with
code that avoids that .queuecommand() gets called in the blocked state.

Note: converting the .queuecommand() call in scsi_send_eh_cmnd() into
code that calls blk_get_request() + blk_execute_rq() is not an option
since scsi_send_eh_cmnd() must be able to make forward progress even
if all requests have been allocated.

Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: Restrict user space SCSI device state changes to "running" and "offline"
Bart Van Assche [Mon, 17 Jun 2019 15:18:18 +0000 (08:18 -0700)]
scsi: Restrict user space SCSI device state changes to "running" and "offline"

The ability to modify the SCSI device state was introduced by commit
638127e579a4 ("[PATCH] Fix error handler offline behaviour"; v2.6.12). That
same commit introduced the following device states:

       { SDEV_CREATED, "created" },
       { SDEV_RUNNING, "running" },
       { SDEV_CANCEL,  "cancel"  },
       { SDEV_DEL,     "deleted" },
       { SDEV_QUIESCE, "quiesce" },
       { SDEV_OFFLINE, "offline" },

The SDEV_BLOCK state was introduced later to avoid that an FC cable pull
would immediately result in an I/O error (commit 1094e682310e; "[PATCH]
suspending I/Os to a device"; v2.6.12). That same patch introduced the
ability to set the SDEV_BLOCK state from user space. I'm not sure whether
that ability was introduced on purpose or accidentally.

Since there is agreement that only writing "running" or "offline" into
the SCSI sysfs device state attribute makes sense, restrict sysfs writes
to these values.
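
The restriction can be pictured as a parser that accepts only the two permitted strings. This is a userspace sketch, not the sysfs store function: the enum, the function name, and the tolerance for one trailing newline are assumptions.

```c
#include <string.h>

enum state_sketch { STATE_INVALID = -1, STATE_RUNNING, STATE_OFFLINE };

/* Accept "running" or "offline", optionally followed by one newline. */
static enum state_sketch parse_user_state(const char *buf)
{
	size_t len = strlen(buf);

	if (len > 0 && buf[len - 1] == '\n')
		len--;
	if (len == 7 && memcmp(buf, "running", 7) == 0)
		return STATE_RUNNING;
	if (len == 7 && memcmp(buf, "offline", 7) == 0)
		return STATE_OFFLINE;
	return STATE_INVALID;	/* e.g. "blocked" is rejected */
}
```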

This patch makes sure that SDEV_BLOCK is only used for its original
purpose, namely to allow transport drivers and LLDs to block further
.queuecommand() calls while transport layer or adapter recovery is in
progress.

Note: a web search for "/sys/class/scsi_device" AND "device/state"
revealed several storage configuration guides. The instructions I found
in these guides tell users to write the value "running" or "offline" in
the SCSI device state sysfs attribute and no other values.

[mkp: typo]

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: James Smart <james.smart@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: cxgb4i: add support for IEEE_8021QAZ_APP_SEL_STREAM selector
Varun Prakash [Mon, 17 Jun 2019 13:16:26 +0000 (18:46 +0530)]
scsi: cxgb4i: add support for IEEE_8021QAZ_APP_SEL_STREAM selector

IEEE_8021QAZ_APP_SEL_STREAM is a valid selector for iSCSI connections, so
add code to use IEEE_8021QAZ_APP_SEL_STREAM selector to get priority mask.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: tcmu: Simplify tcmu_update_uio_info()
Christophe JAILLET [Sun, 16 Jun 2019 07:02:20 +0000 (09:02 +0200)]
scsi: tcmu: Simplify tcmu_update_uio_info()

Use 'kasprintf()' instead of:
   - snprintf(NULL, 0...
   - kmalloc(...
   - snprintf(...

This is less verbose and saves 7 bytes (i.e. the space for '/(null)') if
'udev->dev_config' is NULL.
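
For readers unfamiliar with the pattern being replaced, this userspace sketch shows the measure, allocate, format sequence that a single allocating-printf call collapses. The function name and the format strings are hypothetical and are not the ones used by tcmu.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Userspace analogue of the three-step pattern: measure, allocate, format.
 * Returns a malloc()ed string the caller must free(), or NULL on failure.
 */
static char *build_name(const char *name, const char *config)
{
	int has_config = config && config[0];
	int len;
	char *str;

	/* Step 1: ask snprintf() how many bytes the result needs. */
	if (has_config)
		len = snprintf(NULL, 0, "%s/%s", name, config);
	else
		len = snprintf(NULL, 0, "%s", name);
	if (len < 0)
		return NULL;

	/* Step 2: allocate exactly that much, plus the terminator. */
	str = malloc((size_t)len + 1);
	if (!str)
		return NULL;

	/* Step 3: format for real. */
	if (has_config)
		snprintf(str, (size_t)len + 1, "%s/%s", name, config);
	else
		snprintf(str, (size_t)len + 1, "%s", name);
	return str;
}
```

Each format string appears twice here and the two copies must be kept in sync by hand, which is the verbosity the patch removes.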

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: storvsc: Add ability to change scsi queue depth
Branden Bonaby [Fri, 14 Jun 2019 23:48:22 +0000 (19:48 -0400)]
scsi: storvsc: Add ability to change scsi queue depth

Add the ability to change the SCSI queue depth by using the
"scsi_change_queue_depth" function.

[mkp: checkpatch]

Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Mark expected switch fall-through
Gustavo A. R. Silva [Tue, 11 Jun 2019 15:02:19 +0000 (10:02 -0500)]
scsi: mpt3sas: Mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases where
we are expecting to fall through.

This patch fixes the following warning:

drivers/scsi/mpt3sas/mpt3sas_base.c: In function '_base_update_ioc_page1_inlinewith_perf_mode':
drivers/scsi/mpt3sas/mpt3sas_base.c:4510:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (ioc->high_iops_queues) {
      ^
drivers/scsi/mpt3sas/mpt3sas_base.c:4530:2: note: here
  case MPT_PERF_MODE_LATENCY:
  ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
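
A small sketch of what a marked fall-through looks like. The enum and function are hypothetical. The timeout values 0x20 and 0xA are the ones given in the perf_mode commit description further down this log, and the assumption that balanced mode without high iops queues behaves like latency mode is inferred from the warning location, not confirmed by the source.

```c
enum perf_mode_sketch { MODE_BALANCED, MODE_IOPS, MODE_LATENCY };

/*
 * Return the coalescing timeout applied to all reply queues, or 0 when
 * coalescing is left to the dedicated high iops queues.
 */
static unsigned int all_queue_timeout(enum perf_mode_sketch mode,
				      int high_iops_queues)
{
	switch (mode) {
	case MODE_BALANCED:
		if (high_iops_queues)
			return 0;
		/* fall through */
	case MODE_LATENCY:
		return 0xA;
	case MODE_IOPS:
		return 0x20;
	}
	return 0xA;	/* not reached for valid modes */
}
```

The comment before 'case MODE_LATENCY' is what tells the compiler, at the warning level quoted above, that the missing break is intentional.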

Fixes: 30cb97023f38 ("scsi: mpt3sas: Introduce perf_mode module parameter")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: libsas: aic94xx: hisi_sas: mvsas: pm8001: Use dev_is_expander()
John Garry [Mon, 10 Jun 2019 12:41:41 +0000 (20:41 +0800)]
scsi: libsas: aic94xx: hisi_sas: mvsas: pm8001: Use dev_is_expander()

Many times in libsas, and in LLDDs which use libsas, the check for an
expander device is re-implemented or open coded.

Use dev_is_expander() instead. We rename this from
sas_dev_type_is_expander() to not spill so many lines in referencing.
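
A rough userspace sketch of what such a helper looks like. The enum values and the helper's signature here are assumptions made for illustration; the real definitions live in the libsas headers.

```c
#include <stdbool.h>

/* Hypothetical subset of device types; names are illustrative only. */
enum sketch_dev_type {
	SKETCH_END_DEVICE,
	SKETCH_EDGE_EXPANDER,
	SKETCH_FANOUT_EXPANDER,
};

/* One shared helper replaces the open-coded two-way comparison. */
bool sketch_dev_is_expander(enum sketch_dev_type type)
{
	return type == SKETCH_EDGE_EXPANDER || type == SKETCH_FANOUT_EXPANDER;
}
```

Callers then test `sketch_dev_is_expander(type)` instead of repeating both comparisons at every site.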

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Update driver version to 29.100.00.00
Suganath Prabu S [Fri, 31 May 2019 12:14:43 +0000 (08:14 -0400)]
scsi: mpt3sas: Update driver version to 29.100.00.00

Update driver version from 28.100.00.00 to 29.100.00.00.
This is equivalent to Phase 10 OOB driver.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Introduce perf_mode module parameter
Suganath Prabu S [Fri, 31 May 2019 12:14:42 +0000 (08:14 -0400)]
scsi: mpt3sas: Introduce perf_mode module parameter

1. Introduce module parameter perf_mode for only Aero/Sea generation HBAs.

2. Update IOC page1 fields according to performance mode.

Below are the performance modes that can be enabled with module parameter
perf_mode:

 0: Balanced - Few high iops reply queues will be enabled.  Interrupt
    coalescing will be enabled only for these high iops reply descriptor
    queues.

 1: Iops - Interrupt coalescing will be enabled on all reply queues.
    Coalescing timeout is set to 0x20. This is the default value for Aero.

 2: Latency - Interrupt coalescing will be enabled on all reply queues.
    Coalescing timeout is set to 0xA.  This is a legacy behavior similar to
    Ventura & Invader HBA series.

Default perf mode set by driver will be balanced mode if the following
conditions are met:

 - CPU vendor = Intel;
 - Aero controller working in 16GT/s pcie speed

Performance mode will be set to latency mode for all other cases.
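
The default selection described above can be sketched as follows. This is a simplified illustration, not the driver code: the function and parameter names are hypothetical, and the two inputs are assumed to have been determined by the caller.

```c
#include <stdbool.h>

/* Numeric values match the perf_mode parameter values listed above. */
enum sketch_perf_mode {
	SKETCH_PERF_BALANCED = 0,
	SKETCH_PERF_IOPS = 1,
	SKETCH_PERF_LATENCY = 2,
};

/*
 * Default mode when the user did not pick one: balanced only on an Intel
 * CPU with the controller running at 16GT/s, latency in all other cases.
 */
enum sketch_perf_mode sketch_default_perf_mode(bool cpu_is_intel,
					       bool link_is_16gts)
{
	if (cpu_is_intel && link_is_16gts)
		return SKETCH_PERF_BALANCED;
	return SKETCH_PERF_LATENCY;
}
```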

4k Random Read IO performance numbers on 24 SAS SSD drives for above three
performance modes. Performance data is from Intel Skylake and HGST SS300
(drive model SDLL1DLR400GCCA1).

IOPs:
 -----------------------------------------------------------------------
  |perf_mode    | qd = 1 | qd = 64 |   note                             |
  |-------------|--------|---------|-------------------------------------
  |balanced     |  259K  |  3061k  | Provides max performance numbers   |
  |             |        |         | both on lower QD workload &        |
  |             |        |         | also on higher QD workload         |
  |-------------|--------|---------|-------------------------------------
  |iops         |  220K  |  3100k  | Provides max performance numbers   |
  |             |        |         | only on higher QD workload.        |
  |-------------|--------|---------|-------------------------------------
  |latency      |  246k  |  2226k  | Provides good performance numbers  |
  |             |        |         | only on lower QD workload.         |
  -----------------------------------------------------------------------

Average Latency:
  -----------------------------------------------------
  |perf_mode    |  qd = 1      |    qd = 64           |
  |-------------|--------------|----------------------|
  |balanced     |  92.05 usec  |    501.12 usec       |
  |-------------|--------------|----------------------|
  |iops         |  108.40 usec |    498.10 usec       |
  |-------------|--------------|----------------------|
  |latency      |  97.10 usec  |    689.26 usec       |
  -----------------------------------------------------

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Enable interrupt coalescing on high iops
Suganath Prabu S [Fri, 31 May 2019 12:14:41 +0000 (08:14 -0400)]
scsi: mpt3sas: Enable interrupt coalescing on high iops

Enable interrupt coalescing only on high iops queues.

In ioc config page 1, offset 0x14 (ProductSpecific field) is used to
determine interrupt coalescing enabled/disabled on per reply descriptor
post queue group(8) basis.  If 31st bit is zero, then interrupt coalescing
is enabled for all reply descriptor post queues. If 31st bit is set to one,
then user can enable/disable interrupt coalescing on per reply descriptor
post queue group(8) basis. So to enable interrupt coalescing only on first
reply descriptor post queue group (i.e. on high iops queues), set bit 0 and
31.
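
As a worked example of the bit layout described above, the following sketch builds and decodes the field value. The macro and function names are hypothetical; only the meaning of bit 31 and of the per-group bits is taken from the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the bits described above. */
#define SKETCH_PER_GROUP_CTRL	(UINT32_C(1) << 31)	/* per-group control */
#define SKETCH_GROUP(n)		(UINT32_C(1) << (n))	/* group n, n < 31 */

/* Value that enables coalescing only on the first queue group. */
uint32_t sketch_high_iops_only(void)
{
	return SKETCH_PER_GROUP_CTRL | SKETCH_GROUP(0);
}

/* Decode: is coalescing enabled for the given queue group? */
bool sketch_coalescing_enabled(uint32_t product_specific, unsigned int group)
{
	if (!(product_specific & SKETCH_PER_GROUP_CTRL))
		return true;	/* bit 31 clear: enabled on all queues */
	return (product_specific & SKETCH_GROUP(group)) != 0;
}
```

With bits 0 and 31 set the value is 0x80000001, so group 0 has coalescing enabled and every other group has it disabled.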

This configuration should reset during driver unload or shutdown to the
default settings. For this, the driver takes copy of default ioc page 1 and
copies back the default or unmodified ioc page1 during unload and
shutdown. This means that on next driver load (e.g. if older version driver
is loaded by user), current modified changes on ioc page1 won't take
effect.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Affinity high iops queues IRQs to local node
Suganath Prabu S [Fri, 31 May 2019 12:14:40 +0000 (08:14 -0400)]
scsi: mpt3sas: Affinity high iops queues IRQs to local node

High iops queues are mapped to non-managed irqs. Set affinity of
non-managed irqs to local numa node.  Low latency queues are mapped to
managed irqs.

Driver reserves some reply queues for max iops (through
pci_alloc_irq_vectors_affinity and .pre_vectors interface). The rest of
queues are for low latency.

Based on io workload in io submission path, driver will decide which group
of reply queues (either high iops queues or low latency queues) to be
used. High iops queues will be mapped to local numa node of controller and
low latency queues will be mapped to cpus across numa nodes. In general,
high iops and low latency queues should fit into 128 reply queues
which is the max number of reply queues supported by Aero/Sea.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: save and use MSI-X index for posting RD
Suganath Prabu S [Fri, 31 May 2019 12:14:39 +0000 (08:14 -0400)]
scsi: mpt3sas: save and use MSI-X index for posting RD

In the IO submission path _base_get_msix_index is called twice. Initially
while getting the smid and subsequently while posting the request
descriptor (RD).

Refactor code to query msix index only while posting the request
descriptor. Save determined msix index in msix_io field.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Use high iops queues under some circumstances
Suganath Prabu S [Fri, 31 May 2019 12:14:38 +0000 (08:14 -0400)]
scsi: mpt3sas: Use high iops queues under some circumstances

The driver will use round-robin method for io submission in batches within
the high iops queues when the number of in-flight ios on the target device
is larger than 8. Otherwise the driver will use low latency reply queues.
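
A simplified sketch of this selection logic. The queue count, the batch size and all names are assumptions for illustration; only the threshold of 8 in-flight ios comes from the description. A real driver would also need an atomic counter, which this single-threaded sketch leaves out.

```c
#define SKETCH_HIGH_IOPS_QUEUES	8	/* assumed number of high iops queues */
#define SKETCH_BATCH_COUNT	16	/* assumed ios per batch */
#define SKETCH_INFLIGHT_LIMIT	8	/* threshold from the description */

/*
 * Pick a reply queue index. `counter` counts ios sent to the high iops
 * group so far; `low_latency_index` is the queue the submitting CPU would
 * otherwise use.
 */
unsigned int sketch_pick_queue(unsigned int inflight, unsigned long *counter,
			       unsigned int low_latency_index)
{
	if (inflight > SKETCH_INFLIGHT_LIMIT) {
		unsigned long n = (*counter)++;

		/* Same queue for a whole batch, then move to the next one. */
		return (n / SKETCH_BATCH_COUNT) % SKETCH_HIGH_IOPS_QUEUES;
	}
	return low_latency_index;
}
```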

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: change _base_get_msix_index prototype
Suganath Prabu S [Fri, 31 May 2019 12:14:37 +0000 (08:14 -0400)]
scsi: mpt3sas: change _base_get_msix_index prototype

Code refactoring.

In function _base_get_msix_index, add scmd as second argument. This change
is made in preparation for the next patch where we introduce a new function
to get the MSI-X index for high iops queues.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Add flag high_iops_queues
Suganath Prabu S [Fri, 31 May 2019 12:14:36 +0000 (08:14 -0400)]
scsi: mpt3sas: Add flag high_iops_queues

Aero controllers support balanced performance mode through the ability to
configure queues with different properties.

Reply queues with interrupt coalescing enabled are called "high iops reply
queues" and reply queues with interrupt coalescing disabled are called "low
latency reply queues".

The driver configures a combination of high iops and low latency reply
queues if:

 - HBA is an AERO controller;

 - MSI-X vectors supported by the HBA is 128;

 - Total CPU count in the system more than high iops queue count;

 - Driver is loaded with default max_msix_vectors module parameter; and

 - System booted in non-kdump mode.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: Add Atomic RequestDescriptor support on Aero
Suganath Prabu S [Fri, 31 May 2019 12:14:35 +0000 (08:14 -0400)]
scsi: mpt3sas: Add Atomic RequestDescriptor support on Aero

If the Aero HBA supports Atomic Request Descriptors, it sets the Atomic
Request Descriptor Capable bit in the IOCCapabilities field of the IOCFacts
Reply message. Driver uses an Atomic Request Descriptor as an alternative
method for posting an entry onto a request queue.

The posting of an Atomic Request Descriptor is an atomic operation,
providing a safe mechanism for multiple processors on the host to post
requests without synchronization. This Atomic Request Descriptor format is
identical to first 32 bits of Default Request Descriptor and uses only 32
bits.
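
To illustrate the size relationship, here is a sketch with hypothetical structure layouts. The field names and their order are assumptions; the only property taken from the description is that the atomic form equals the first 32 bits of the default form.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-bit default descriptor; field names are illustrative. */
struct sketch_default_rd {
	uint8_t  request_flags;
	uint8_t  msix_index;
	uint16_t smid;
	uint16_t lmid;
	uint16_t type_dependent;
};

/* Hypothetical 32-bit atomic descriptor: the first half of the above. */
struct sketch_atomic_rd {
	uint8_t  request_flags;
	uint8_t  msix_index;
	uint16_t smid;
};

/* Build the atomic form by copying the first 32 bits of the default form. */
struct sketch_atomic_rd sketch_to_atomic(const struct sketch_default_rd *d)
{
	struct sketch_atomic_rd a;

	memcpy(&a, d, sizeof(a));
	return a;
}
```

Because the whole descriptor fits in 32 bits, it can be posted with a single register write, which is what removes the need for locking between CPUs.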

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas: function pointers of request descriptor
Suganath Prabu S [Fri, 31 May 2019 12:14:34 +0000 (08:14 -0400)]
scsi: mpt3sas: function pointers of request descriptor

This code refactoring introduces function pointers.

Host uses Request Descriptors of different types for posting an entry onto
a request queue. Based on controller type and capabilities, host can also
use atomic descriptors other than normal descriptors.  Using function
pointers avoids if-else statements.
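
A minimal sketch of the pattern, with hypothetical names throughout. The posting function is chosen once at init time, so the submission path calls through the pointer instead of testing the controller capability on every request.

```c
#include <stdbool.h>

struct sketch_hba;
typedef void (*sketch_post_fn)(struct sketch_hba *hba, unsigned int smid);

struct sketch_hba {
	sketch_post_fn post_request;	/* chosen once at init time */
	int last_kind;			/* 1 = default, 2 = atomic (demo only) */
	unsigned int last_smid;
};

static void sketch_post_default(struct sketch_hba *hba, unsigned int smid)
{
	hba->last_kind = 1;
	hba->last_smid = smid;
}

static void sketch_post_atomic(struct sketch_hba *hba, unsigned int smid)
{
	hba->last_kind = 2;
	hba->last_smid = smid;
}

/* Select the posting routine from the controller capability. */
void sketch_hba_init(struct sketch_hba *hba, bool atomic_capable)
{
	hba->last_kind = 0;
	hba->last_smid = 0;
	hba->post_request = atomic_capable ? sketch_post_atomic
					   : sketch_post_default;
}
```

The hot path then reduces to `hba->post_request(hba, smid)`.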

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: isci: Grammar s/the its/its/
Geert Uytterhoeven [Fri, 7 Jun 2019 11:34:26 +0000 (13:34 +0200)]
scsi: isci: Grammar s/the its/its/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: aic7xxx: Spelling s/configuraion/configuration/
Geert Uytterhoeven [Fri, 7 Jun 2019 11:27:36 +0000 (13:27 +0200)]
scsi: aic7xxx: Spelling s/configuraion/configuration/

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: Remove unused including <linux/version.h>
YueHaibing [Sat, 1 Jun 2019 03:18:06 +0000 (03:18 +0000)]
scsi: megaraid_sas: Remove unused including <linux/version.h>

Remove the include of <linux/version.h> from files that don't need it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: use DEVICE_ATTR_{RO, RW}
Tomas Henzl [Wed, 29 May 2019 16:00:41 +0000 (18:00 +0200)]
scsi: megaraid_sas: use DEVICE_ATTR_{RO, RW}

Use existing macros.  No functional change.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: use octal permissions instead of constants
Tomas Henzl [Wed, 29 May 2019 16:00:40 +0000 (18:00 +0200)]
scsi: megaraid_sas: use octal permissions instead of constants

Checkpatch emits a warning when using symbolic permissions. Use octal
permissions instead.  No functional change.
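
Why this is no functional change can be checked in userspace. The sketch below uses the POSIX mode macros from <sys/stat.h> as a stand-in for the kernel's symbolic constants; the function names are hypothetical.

```c
#include <sys/stat.h>

/* Symbolic spelling of "readable by user, group and others". */
unsigned int sketch_symbolic_read_all(void)
{
	return S_IRUSR | S_IRGRP | S_IROTH;
}

/* Octal spelling of the same permission bits. */
unsigned int sketch_octal_read_all(void)
{
	return 0444;
}
```

On Linux both functions return the same value, so only the spelling changes.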

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: make max_sectors visible in sys
Tomas Henzl [Wed, 29 May 2019 16:00:39 +0000 (18:00 +0200)]
scsi: megaraid_sas: make max_sectors visible in sys

Support is easier with all driver parameters visible in sysfs.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: remove set but not used variables 'buff_addr' and 'ci_h'
YueHaibing [Sat, 25 May 2019 12:40:06 +0000 (20:40 +0800)]
scsi: megaraid_sas: remove set but not used variables 'buff_addr' and 'ci_h'

Fixes gcc '-Wunused-but-set-variable' warnings:

drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_fw_crash_buffer_show:
drivers/scsi/megaraid/megaraid_sas_base.c:3138:16: warning: variable buff_addr set but not used [-Wunused-but-set-variable]
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_get_pd_list:
drivers/scsi/megaraid/megaraid_sas_base.c:4426:13: warning: variable ci_h set but not used [-Wunused-but-set-variable]

'buff_addr' is never used since introduction in commit fc62b3fc9021
("megaraid_sas : Firmware crash dump feature support")

'ci_h' is not used since commit 9b3d028f3468 ("scsi: megaraid_sas:
Pre-allocate frequently used DMA buffers")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: remove set but not used variable 'sge_sz'
YueHaibing [Sat, 25 May 2019 12:37:05 +0000 (20:37 +0800)]
scsi: megaraid_sas: remove set but not used variable 'sge_sz'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_create_frame_pool:
drivers/scsi/megaraid/megaraid_sas_base.c:4124:6: warning: variable sge_sz set but not used [-Wunused-but-set-variable]

It's not used any more since commit 200aed582d61 ("megaraid_sas: endianness
related bug fixes and code optimization")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Avoid unused function warnings
Nathan Chancellor [Thu, 6 Jun 2019 05:24:21 +0000 (22:24 -0700)]
scsi: lpfc: Avoid unused function warnings

When building powerpc pseries_defconfig or powernv_defconfig:

drivers/scsi/lpfc/lpfc_nvmet.c:224:1: error: unused function
'lpfc_nvmet_get_ctx_for_xri' [-Werror,-Wunused-function]
drivers/scsi/lpfc/lpfc_nvmet.c:246:1: error: unused function
'lpfc_nvmet_get_ctx_for_oxid' [-Werror,-Wunused-function]

These functions are only compiled when CONFIG_NVME_TARGET_FC is enabled.
Use that same condition so there is no more warning. While the fixes commit
did not introduce these functions, it caused these warnings.

Fixes: 4064b27417a7 ("scsi: lpfc: Make some symbols static")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: MAINTAINERS: update maintainer for PM8001
Jack Wang [Thu, 6 Jun 2019 15:33:05 +0000 (17:33 +0200)]
scsi: MAINTAINERS: update maintainer for PM8001

Lindar's email address has been bouncing for some time, so just remove it.

ProfitBricks was rebranded to 1 & 1 Cloud IONOS, so update my email address
too.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ibmvscsi: Don't use rc uninitialized in ibmvscsi_do_work
Nathan Chancellor [Mon, 3 Jun 2019 23:44:06 +0000 (16:44 -0700)]
scsi: ibmvscsi: Don't use rc uninitialized in ibmvscsi_do_work

clang warns:

drivers/scsi/ibmvscsi/ibmvscsi.c:2126:7: warning: variable 'rc' is used
uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
        case IBMVSCSI_HOST_ACTION_NONE:
             ^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/ibmvscsi/ibmvscsi.c:2151:6: note: uninitialized use occurs
here
        if (rc) {
            ^~

Initialize rc in the IBMVSCSI_HOST_ACTION_UNBLOCK case statement then
shuffle IBMVSCSI_HOST_ACTION_NONE down to the default case statement and
make it return early so that rc is never used uninitialized in this
function.
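
The resulting control flow can be sketched as follows. The enum values, the stub handlers and their return values are hypothetical and only show the shape of the fix: every path that reaches the final use of rc has assigned it first.

```c
enum sketch_action {
	SKETCH_ACTION_NONE,
	SKETCH_ACTION_RESET,
	SKETCH_ACTION_REENABLE,
	SKETCH_ACTION_UNBLOCK,
};

/* Stub handlers standing in for the real reset and reenable work. */
static int sketch_reset(void)    { return 0; }
static int sketch_reenable(void) { return 5; }

/* Returns -1 when no work was pending, otherwise the handler's result. */
int sketch_do_work(enum sketch_action action)
{
	int rc;

	switch (action) {
	case SKETCH_ACTION_UNBLOCK:
		rc = 0;		/* initialized explicitly for this case */
		break;
	case SKETCH_ACTION_RESET:
		rc = sketch_reset();
		break;
	case SKETCH_ACTION_REENABLE:
		rc = sketch_reenable();
		break;
	case SKETCH_ACTION_NONE:
	default:
		return -1;	/* early return: rc is never read unset */
	}

	return rc;
}
```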

Fixes: 035a3c4046b5 ("scsi: ibmvscsi: redo driver work thread to use enum action states")
Link: https://github.com/ClangBuiltLinux/linux/issues/502
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Suggested-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Make some symbols static
YueHaibing [Fri, 31 May 2019 15:28:41 +0000 (23:28 +0800)]
scsi: lpfc: Make some symbols static

Fix sparse warnings:

drivers/scsi/lpfc/lpfc_sli.c:115:1: warning: symbol 'lpfc_sli4_pcimem_bcopy' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_sli.c:7854:1: warning: symbol 'lpfc_sli4_process_missed_mbox_completions' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:223:27: warning: symbol 'lpfc_nvmet_get_ctx_for_xri' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:245:27: warning: symbol 'lpfc_nvmet_get_ctx_for_oxid' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_init.c:75:10: warning: symbol 'lpfc_present_cpu' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Remove set but not used variables 'qp'
YueHaibing [Fri, 31 May 2019 15:27:45 +0000 (23:27 +0800)]
scsi: lpfc: Remove set but not used variables 'qp'

Fixes gcc '-Wunused-but-set-variable' warnings:

drivers/scsi/lpfc/lpfc_init.c: In function lpfc_setup_cq_lookup:
drivers/scsi/lpfc/lpfc_init.c:9359:30: warning: variable qp set but not used [-Wunused-but-set-variable]

It's not used since commit e70596a60f88 ("scsi: lpfc: Fix poor use of
hardware queues if fewer irq vectors")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: qla2xxx: remove double assignment in qla2x00_update_fcport
Enzo Matsumiya [Tue, 7 May 2019 15:39:05 +0000 (12:39 -0300)]
scsi: qla2xxx: remove double assignment in qla2x00_update_fcport

Remove double assignment in qla2x00_update_fcport().

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: hisi_sas: Disable stash for v3 hw
Xiang Chen [Wed, 29 May 2019 09:58:47 +0000 (17:58 +0800)]
scsi: hisi_sas: Disable stash for v3 hw

For v3 hw, stash is enabled to promote performance, but it does little to
improve performance according to current tests. What's more, it causes
exceptions for some situations, so disable it.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: hisi_sas: Ignore the error code between phy down to phy up
Luo Jiaxing [Wed, 29 May 2019 09:58:46 +0000 (17:58 +0800)]
scsi: hisi_sas: Ignore the error code between phy down to phy up

Several error codes will be generated between PHY down and PHY up.

This issue was introduced by HW design. The designers came to the
conclusion that we should ignore these errors.

Signed-off-by: Jiaxing Luo <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: hisi_sas: Change the type of some numbers to unsigned
Xiang Chen [Wed, 29 May 2019 09:58:45 +0000 (17:58 +0800)]
scsi: hisi_sas: Change the type of some numbers to unsigned

Some tools report the following error at two places in our code:

runtime error: left shift of 4 by 29 places cannot be represented in type
'int'

So change the type of the two numbers to unsigned to avoid the error.
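
A short worked example of the arithmetic, assuming a 32-bit int. 4 << 29 equals 2^31, one more than the largest signed 32-bit value, so the signed shift overflows. With an unsigned operand the result is well defined. The function name is hypothetical.

```c
/*
 * Shifting the unsigned constant 4U left by 29 places yields 2^31, which
 * fits in a 32-bit unsigned int. The signed form (4 << 29) would overflow.
 */
unsigned int sketch_shifted_field(void)
{
	return 4U << 29;
}
```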

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: hisi_sas: Reduce HISI_SAS_SGE_PAGE_CNT in size
John Garry [Wed, 29 May 2019 09:58:44 +0000 (17:58 +0800)]
scsi: hisi_sas: Reduce HISI_SAS_SGE_PAGE_CNT in size

Macro HISI_SAS_SGE_PAGE_CNT is defined to SG_CHUNK_SIZE, which is 128.

This means that sizeof(struct hisi_sas_slot_buf_table) is 4192. This is
just over a 4K, which can mean inefficient DMA memory usage (for no PI).

Reduce the size of HISI_SAS_SGE_PAGE_CNT to 124 to fit in a 4K page. With
this change, we experience no performance hit.
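
The numbers can be checked with a small calculation. The 24-byte entry size is an assumption, inferred from the figures above: the table overshoots 4096 by 96 bytes and 4 entries are removed. The names are hypothetical.

```c
#define SKETCH_SGE_ENTRY_SIZE	24u	/* assumed bytes per SGE entry */
#define SKETCH_OLD_TABLE_SIZE	4192u	/* size quoted above for 128 entries */
#define SKETCH_OLD_SGE_COUNT	128u

/* Table size after reducing the SGE count to new_cnt (new_cnt <= 128). */
unsigned int sketch_table_size(unsigned int new_cnt)
{
	return SKETCH_OLD_TABLE_SIZE -
	       (SKETCH_OLD_SGE_COUNT - new_cnt) * SKETCH_SGE_ENTRY_SIZE;
}
```

Under that assumption, 124 entries give 4192 - 4 * 24 = 4096 bytes, exactly one 4K page.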

Cc: dann frazier <dann.frazier@canonical.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: hisi_sas: Fix the issue of argument mismatch of printing ecc errors
Xiaofei Tan [Wed, 29 May 2019 09:58:43 +0000 (17:58 +0800)]
scsi: hisi_sas: Fix the issue of argument mismatch of printing ecc errors

The argument of dev_err() called by multi_bit_ecc_error_process_v3_hw() is
not right. We pass two arguments, but there is only one printk format
specifier in the string.

Also move the print format string to dev_err().

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: hisi_sas: Delete PHY timers when rmmod or probe failed
Xiang Chen [Wed, 29 May 2019 09:58:42 +0000 (17:58 +0800)]
scsi: hisi_sas: Delete PHY timers when rmmod or probe failed

When removing the driver or when probe fails, we need to delete the PHY
timers.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: message: fusion: Use kmemdup instead of memcpy and kmalloc
Bharath Vedartham [Wed, 22 May 2019 16:01:49 +0000 (21:31 +0530)]
scsi: message: fusion: Use kmemdup instead of memcpy and kmalloc

Replace kmalloc + memcpy with kmemdup.

This was reported by coccinelle.
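
For readers unfamiliar with the pattern, a userspace analogue is sketched here. The helper name is hypothetical, and malloc() stands in for the kernel allocator.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate len bytes and copy src into them in one call. Returns NULL if
 * the allocation fails; the caller frees the result.
 */
void *sketch_memdup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

Folding the two steps into one helper removes the chance of the allocation size and the copy size drifting apart.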

Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: remove set but not used variables 'host' and 'wait_time'
YueHaibing [Sat, 25 May 2019 12:42:02 +0000 (20:42 +0800)]
scsi: megaraid_sas: remove set but not used variables 'host' and 'wait_time'

Fixes gcc '-Wunused-but-set-variable' warnings:

drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_suspend:
drivers/scsi/megaraid/megaraid_sas_base.c:7269:20: warning: variable host set but not used [-Wunused-but-set-variable]
drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_aen_polling:
drivers/scsi/megaraid/megaraid_sas_base.c:8397:15: warning: variable wait_time set but not used [-Wunused-but-set-variable]

'host' never used since introduction in commit 31ea7088974c ("[SCSI]
megaraid_sas: add hibernation support")

'wait_time' never used since commit 11c71cb4ab7c ("megaraid_sas: Do
not allow PCI access during OCR")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: megaraid_sas: remove set but not used variable 'cur_state'
YueHaibing [Sat, 25 May 2019 12:38:21 +0000 (20:38 +0800)]
scsi: megaraid_sas: remove set but not used variable 'cur_state'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/megaraid/megaraid_sas_base.c: In function megasas_transition_to_ready:
drivers/scsi/megaraid/megaraid_sas_base.c:3900:6: warning: variable cur_state set but not used [-Wunused-but-set-variable]

Never used since commit 7218df69e360 ("[SCSI] megaraid_sas: use the
firmware boot timeout when waiting for commands")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: mpt3sas_ctl: fix double-fetch bug in _ctl_ioctl_main()
Gen Zhang [Thu, 30 May 2019 01:10:30 +0000 (09:10 +0800)]
scsi: mpt3sas_ctl: fix double-fetch bug in _ctl_ioctl_main()

In _ctl_ioctl_main(), 'ioctl_header' is fetched the first time from
userspace. 'ioctl_header.ioc_number' is then checked. The legal result is
saved to 'ioc'. Then, in condition MPT3COMMAND, the whole struct is fetched
again from the userspace. Then _ctl_do_mpt_command() is called, 'ioc' and
'karg' as inputs.

However, a malicious user can change the 'ioc_number' between the two
fetches, which will cause a potential security issue.  Moreover, a
malicious user can provide a valid 'ioc_number' to pass the check in first
fetch, and then modify it in the second fetch.

To fix this, we need to recheck the 'ioc_number' in the second fetch.
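
The problem and the recheck can be simulated in userspace. Everything in this sketch is hypothetical: the structure, the limit, and the callback that plays the role of a racing user thread. The plain memcpy() stands in for copying from userspace.

```c
#include <string.h>

struct sketch_header {
	unsigned int ioc_number;
	unsigned int payload;
};

#define SKETCH_MAX_IOC 4u	/* hypothetical number of valid controllers */

/* Stand-in for a copy from user memory. */
static void sketch_fetch(struct sketch_header *dst,
			 const struct sketch_header *user)
{
	memcpy(dst, user, sizeof(*dst));
}

/* Simulated attacker: rewrites ioc_number between the two fetches. */
void sketch_flip_ioc(struct sketch_header *user)
{
	user->ioc_number = 99;
}

/*
 * Returns the validated ioc number, or -1 on failure. `race` may be NULL;
 * if set, it runs between the two fetches.
 */
int sketch_handle(struct sketch_header *user,
		  void (*race)(struct sketch_header *))
{
	struct sketch_header first, second;

	sketch_fetch(&first, user);
	if (first.ioc_number >= SKETCH_MAX_IOC)
		return -1;

	if (race)
		race(user);

	sketch_fetch(&second, user);
	if (second.ioc_number != first.ioc_number)
		return -1;	/* recheck: value changed after validation */

	return (int)second.ioc_number;
}
```

Without the second comparison, the value 99 would be used even though only the first copy was validated.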

Signed-off-by: Gen Zhang <blackgod016574@gmail.com>
Acked-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufs: Add error-handling of Auto-Hibernate
Stanley Chu [Tue, 21 May 2019 06:44:54 +0000 (14:44 +0800)]
scsi: ufs: Add error-handling of Auto-Hibernate

Currently auto-hibernate is activated if host supports auto-hibern8
capability. However error-handling is not implemented, which makes the
feature somewhat risky.

If either "Hibernate Enter" or "Hibernate Exit" fail during auto-hibernate
flow, the corresponding interrupt "UIC_HIBERNATE_ENTER" or
"UIC_HIBERNATE_EXIT" shall be raised according to UFS specification.

This patch adds auto-hibernate error-handling:

 - Monitor "Hibernate Enter" and "Hibernate Exit" interrupts after
   auto-hibernate feature is activated.

 - If a failure happens, trigger error-handling just like
   "manual-hibernate" failure and apply the same recovery flow: schedule
   UFS error handler in ufshcd_check_errors(), and then do host reset and
   restore in UFS error handler.

Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufs: Do not overwrite Auto-Hibernate timer
Stanley Chu [Tue, 21 May 2019 06:44:53 +0000 (14:44 +0800)]
scsi: ufs: Do not overwrite Auto-Hibernate timer

Some vendor-specific initialization flow may set its own auto-hibernate
timer. In this case, do not overwrite timer value as "default value" in
ufshcd_init().

Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ufs: Introduce ufshcd_is_auto_hibern8_supported()
Stanley Chu [Tue, 21 May 2019 06:44:52 +0000 (14:44 +0800)]
scsi: ufs: Introduce ufshcd_is_auto_hibern8_supported()

The checking of Auto-Hibernation support is used in many places in the
driver, so refactor it into ufshcd_is_auto_hibern8_supported() to make the
code cleaner.

Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: libsas: no need to join wide port again in sas_ex_discover_dev()
Jason Yan [Mon, 20 May 2019 14:06:00 +0000 (22:06 +0800)]
scsi: libsas: no need to join wide port again in sas_ex_discover_dev()

Since we are processing events synchronously now, the second call of
sas_ex_join_wide_port() in sas_ex_discover_dev() is not needed. There will
be no races with other works in disco workqueue. So remove the second
sas_ex_join_wide_port().

I did not change the return value of 'res' to error when discover failed
because we need to continue to discover other phys if one phy discover
failed. So let's keep that logic as before and just add a debug log to
detect the failure. And directly return if second fanout expander attatched
to the parent expander because it has nothing to do after the phy is
disabled.

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Use *_pool_zalloc rather than *_pool_alloc
Thomas Meyer [Wed, 29 May 2019 20:21:36 +0000 (22:21 +0200)]
scsi: lpfc: Use *_pool_zalloc rather than *_pool_alloc

Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: hpsa: fix an uninitialized read and dereference of pointer dev
Colin Ian King [Wed, 22 May 2019 08:39:03 +0000 (09:39 +0100)]
scsi: hpsa: fix an uninitialized read and dereference of pointer dev

Currently the check for a lockup_detected failure exits via the label
return_reset_status that reads and dereferences an uninitialized pointer
dev.  Fix this by ensuring dev is initialized to null.

Addresses-Coverity: ("Uninitialized pointer read")
Fixes: 14991a5bade5 ("scsi: hpsa: correct device resets")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: target/iscsi: fix possible condition with no effect (if == else)
Hariprasad Kelam [Tue, 28 May 2019 01:21:52 +0000 (06:51 +0530)]
scsi: target/iscsi: fix possible condition with no effect (if == else)

Fix the following warning reported by coccicheck:

drivers/target/iscsi/iscsi_target_nego.c:175:6-8: WARNING: possible
condition with no effect (if == else)

Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: pm8001: Fix typo in code comments
Weitao Hou [Mon, 20 May 2019 03:24:03 +0000 (11:24 +0800)]
scsi: pm8001: Fix typo in code comments

Fix abord to abort.

Signed-off-by: Weitao Hou <houweitaoo@gmail.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: fdomain: Add PCMCIA support
Ondrej Zary [Mon, 27 May 2019 20:19:47 +0000 (22:19 +0200)]
scsi: fdomain: Add PCMCIA support

Add PCMCIA card support to Future Domain SCSI driver.

Tested with IBM SCSI PCMCIA Adapter 40G1890.

Signed-off-by: Ondrej Zary <linux@zary.sk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: fdomain: Add register definitions
Ondrej Zary [Sat, 18 May 2019 19:47:24 +0000 (21:47 +0200)]
scsi: fdomain: Add register definitions

Add register bit definitions from documentation to header file and use them
instead of magic constants. No changes to generated binary.

Signed-off-by: Ondrej Zary <linux@zary.sk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ibmvscsi: fix tripping of blk_mq_run_hw_queue WARN_ON
Tyrel Datwyler [Fri, 3 May 2019 00:50:58 +0000 (19:50 -0500)]
scsi: ibmvscsi: fix tripping of blk_mq_run_hw_queue WARN_ON

After a successful SRP login response we call scsi_unblock_requests() to
kick any pending IOs. The callback to process this SRP response happens in
a tasklet and therefore is in softirq context. The result of such is that
when blk-mq is enabled, it is no longer safe to call scsi_unblock_requests()
from this context. Doing so triggers the following WARN_ON
splat in dmesg after a host reset or CRQ reenablement.

WARNING: CPU: 0 PID: 0 at block/blk-mq.c:1375 __blk_mq_run_hw_queue+0x120/0x180
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc8 #4
NIP [c0000000009771e0] __blk_mq_run_hw_queue+0x120/0x180
LR [c000000000977484] __blk_mq_delay_run_hw_queue+0x244/0x250
Call Trace:

__blk_mq_delay_run_hw_queue+0x244/0x250
blk_mq_run_hw_queue+0x8c/0x1c0
blk_mq_run_hw_queues+0x60/0x90
scsi_run_queue+0x1e4/0x3b0
scsi_run_host_queues+0x48/0x80
login_rsp+0xb0/0x100
ibmvscsi_handle_crq+0x30c/0x3e0
ibmvscsi_task+0x54/0xe0
tasklet_action_common.isra.3+0xc4/0x1a0
__do_softirq+0x174/0x3f4
irq_exit+0xf0/0x120
__do_irq+0xb0/0x210
call_do_irq+0x14/0x24
do_IRQ+0x9c/0x130
hardware_interrupt_common+0x14c/0x150

This patch fixes the issue by introducing a new host action for unblocking
the scsi requests in our separate work thread.

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ibmvscsi: redo driver work thread to use enum action states
Tyrel Datwyler [Fri, 3 May 2019 00:50:57 +0000 (19:50 -0500)]
scsi: ibmvscsi: redo driver work thread to use enum action states

The current implementation relies on two flags in the driver's private host
structure to signal the need for a host reset or to reenable the CRQ after
a LPAR migration. This patch does away with those flags and introduces a
single action flag and defined enums for the supported kthread work
actions. Lastly, the if/else logic is replaced with a switch statement.
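
A minimal sketch of the pattern described above, assuming a single action
field and a switch in the work function. The enum values, struct, and
function names are hypothetical and are not taken from the driver:

```c
/* Hypothetical work actions; names are illustrative, not the driver's own. */
enum work_action {
	WORK_ACTION_NONE = 0,
	WORK_ACTION_RESET,
	WORK_ACTION_REENABLE,
};

struct host_state {
	enum work_action action;	/* one field instead of two flags */
	unsigned int resets;		/* counts resets handled */
	unsigned int reenables;		/* counts reenables handled */
};

/* Handle whichever action is pending, then clear it. */
static void do_work(struct host_state *h)
{
	switch (h->action) {
	case WORK_ACTION_RESET:
		h->resets++;
		break;
	case WORK_ACTION_REENABLE:
		h->reenables++;
		break;
	case WORK_ACTION_NONE:
	default:
		break;
	}
	h->action = WORK_ACTION_NONE;
}
```

Adding a further action in this shape means adding one enum value and one
case, rather than another flag and another else-if arm.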

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: ibmvscsi: Wire up host_reset() in the driver's scsi_host_template
Tyrel Datwyler [Fri, 3 May 2019 00:50:56 +0000 (19:50 -0500)]
scsi: ibmvscsi: Wire up host_reset() in the driver's scsi_host_template

Wire up the host_reset function in our driver_template to allow a
user-requested adapter reset via the host_reset sysfs attribute.

Example:

echo "adapter" > /sys/class/scsi_host/host0/host_reset

Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Update lpfc version to 12.2.0.3
James Smart [Wed, 22 May 2019 00:49:11 +0000 (17:49 -0700)]
scsi: lpfc: Update lpfc version to 12.2.0.3

Update lpfc version to 12.2.0.3

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix kernel warnings related to smp_processor_id()
James Smart [Wed, 22 May 2019 00:49:10 +0000 (17:49 -0700)]
scsi: lpfc: Fix kernel warnings related to smp_processor_id()

Kernel warnings may be seen with preempt debugging enabled.

Replace smp_processor_id calls with raw_smp_processor_id or cpu information
stored in hdwq structures.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix BFS crash with DIX enabled
James Smart [Wed, 22 May 2019 00:49:09 +0000 (17:49 -0700)]
scsi: lpfc: Fix BFS crash with DIX enabled

Crashes in scsi_queue_rq or in dma_unmap_direct_sg during BFS when lpfc has
lpfc_enable_bg=1.

lpfc is setting DIX and prot sg after scsi_add_host_with_dma() has been
called. The scsi_host_set_prot() and scsi_host_set_guard() routines need to
be called before scsi_add_host_with_dma().

Revise the calling sequence to set the protection/guard data before calling
scsi_add_host_with_dma().

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix FDMI fc4type for nvme support
James Smart [Wed, 22 May 2019 00:49:08 +0000 (17:49 -0700)]
scsi: lpfc: Fix FDMI fc4type for nvme support

FDMI protocol support registration was not accurately showing nvme
support. The fcponly-path clears the parameter object.

Move the code out of the fcponly code path.  Fix the FDMI registration data
to properly check for nvme support.  Commonize the manner in which the fdmi
routines set protocol support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix fcp_rsp_len checking on lun reset
James Smart [Wed, 22 May 2019 00:49:07 +0000 (17:49 -0700)]
scsi: lpfc: Fix fcp_rsp_len checking on lun reset

Issuing a LUN reset was resulting in a command failure which then escalated
to a host reset.

The FCP-4 spec allows fcp_rsp_len field to specify the number of valid
bytes of FCP_RSP_INFO, and the value could be 4 or 8.  The driver is
allowing only a value of 8, thus it failed the command.

Revise the driver to allow 4 or 8.
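
The relaxed check can be sketched as follows. The helper name is
hypothetical and this is not the driver's actual code; it only encodes the
"4 or 8" rule stated above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: accept the FCP_RSP_INFO lengths named above. */
static bool fcp_rsp_len_valid(uint32_t rsp_len)
{
	return rsp_len == 4 || rsp_len == 8;
}
```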

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix poor use of hardware queues if fewer irq vectors
James Smart [Wed, 22 May 2019 00:49:06 +0000 (17:49 -0700)]
scsi: lpfc: Fix poor use of hardware queues if fewer irq vectors

While fixing the resources per socket, we realized the driver was not using
hardware queues (up to 1 per cpu) if there were fewer interrupt
vectors. The driver was only using the hardware queue assigned to the cpu
with the vector.

Rework the affinity map check to use the additional hardware queue elements
that had been allocated.  If the cpu count exceeds the hardware queue
count, share queues, choosing the cpu to share with in this order:
hyperthread peer, core peer, socket peer, or finally a similar cpu in a
different socket.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix oops when driver is loaded with 1 interrupt vector
James Smart [Wed, 22 May 2019 00:49:05 +0000 (17:49 -0700)]
scsi: lpfc: Fix oops when driver is loaded with 1 interrupt vector

The driver was coded expecting enough hardware queues and interrupt vectors
such that there was at least one per socket. In the case where there were
fewer than the number of sockets, cpus were left unassigned, resulting in
null pointers.

Rework the affinity mappings. Map settings for the cpus that are in the
irq cpu mask. For each cpu not in the mask, map to another cpu that does
have a mask. Choice of the "other" cpu will attempt to map to the same cpu
but differing hyperthread, or cpu within the same core, or cpu within the
same socket, or finally cpu in the base socket.
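
The fallback order can be sketched as below. This is a simplified
illustration, not the driver's mapping code: the struct and function are
hypothetical, and the tiers are reduced to three (same core, then same
socket, then any cpu that has a vector):

```c
#include <stddef.h>

/* Hypothetical per-cpu topology record. */
struct cpu_info {
	int core;	/* physical core id */
	int socket;	/* package id */
	int has_irq;	/* nonzero if the cpu is in an irq vector's cpu mask */
};

/* Return the index of the cpu whose vector cpu `self` should share,
 * or -1 if no cpu has a vector. */
static int pick_peer(const struct cpu_info *cpus, size_t n, size_t self)
{
	int same_socket = -1, any = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i == self || !cpus[i].has_irq)
			continue;
		if (cpus[i].socket == cpus[self].socket) {
			if (cpus[i].core == cpus[self].core)
				return (int)i;	/* sibling on the same core */
			if (same_socket < 0)
				same_socket = (int)i;
		} else if (any < 0) {
			any = (int)i;
		}
	}
	return same_socket >= 0 ? same_socket : any;
}
```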

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix incorrect logical link speed on trunks when links down
James Smart [Wed, 22 May 2019 00:49:04 +0000 (17:49 -0700)]
scsi: lpfc: Fix incorrect logical link speed on trunks when links down

Invalid logical speed is displayed for trunk enabled ports when all ports
are down. Also noted that link speed is incorrectly reported for the units
when links are up.

Current code is returning the logical link speed from the last event from
the adapter. In cases where the last link went down, the link speed in the
event was not valid - meaning that although the links were down the field
had a bogus value.

Rework the event handling to qualify the trunk link state before using the
event speed data.

Also correct units on other areas where the logical link speed was taken
from a link event.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix memory leak in abnormal exit path from lpfc_eq_create
James Smart [Wed, 22 May 2019 00:49:03 +0000 (17:49 -0700)]
scsi: lpfc: Fix memory leak in abnormal exit path from lpfc_eq_create

eq create is leaking mailbox memory if it encounters an error.

Rework the error path to free the memory.
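
A generic sketch of the single-exit cleanup idiom commonly used for this
kind of fix. It is not the lpfc code: the allocation counter, the helper
names, and the fail_setup parameter are all hypothetical and exist only to
make the leak observable in user space:

```c
#include <stdlib.h>

static int live_allocs;	/* hypothetical counter of outstanding buffers */

static void *track_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void track_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Every path after the allocation leaves through `out`, so the mailbox
 * buffer is released on the error path as well as on success. */
static int create_queue(int fail_setup)
{
	int rc = 0;
	void *mbox = track_alloc(64);

	if (!mbox)
		return -1;
	if (fail_setup) {
		rc = -1;
		goto out;
	}
	/* normal setup work would go here */
out:
	track_free(mbox);
	return rc;
}
```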

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Rework misleading nvme not supported in firmware message
James Smart [Wed, 22 May 2019 00:49:02 +0000 (17:49 -0700)]
scsi: lpfc: Rework misleading nvme not supported in firmware message

The driver unconditionally says fw doesn't support nvme when in
truth it was a driver parameter setting that disabled nvme support.

Rework the code validating nvme support to accurately report what
condition is disabling nvme support. Save state on whether fw
supports nvme in case sysfs attributes change dynamically.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix hardlockup in scsi_cmd_iocb_cmpl
James Smart [Wed, 22 May 2019 00:49:01 +0000 (17:49 -0700)]
scsi: lpfc: Fix hardlockup in scsi_cmd_iocb_cmpl

There is a race condition with the abort handler declaring a waitq
item on its stack, followed by a timeout in the abort handler that
has it give up on the abort and return to its caller. When the io is
finally aborted and its completion handler called, it references
the waitq element that the abort_handler set up, which is no longer
valid, resulting in a deadlock.

Fix by clearing the waitq reference, under lock, when the abort
handler timeout gives up. Have the completion handler validate the
waitq before referencing it.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Cancel queued work for an IO when processing a received ABTS
James Smart [Wed, 22 May 2019 00:49:00 +0000 (17:49 -0700)]
scsi: lpfc: Cancel queued work for an IO when processing a received ABTS

When queued work is executed posting a new command to the transport
the driver is reporting a null buffer.

The driver had received an ABTS which matched a command that had
been scheduled for delivery to the transport. The driver proceeded
to cancel the command, but the work item was never cancelled.

Fix by cancelling the queued work item. Also turns out the ABTS
response was not properly sending a BA_ACC, so set the flag to
send the ACC.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Prevent 'use after free' memory overwrite in nvmet LS handling
James Smart [Wed, 22 May 2019 00:48:59 +0000 (17:48 -0700)]
scsi: lpfc: Prevent 'use after free' memory overwrite in nvmet LS handling

Use-after-free memory overwrite detected. Problem reported
by Ewan Milne at Red Hat after running lpfc target with additional
memory checking enabled.

Race condition when lpfc_nvmet_xmt_ls_rsp_cmp frees the ctxp
memory in interrupt context before lpfc_nvmet_xmt_ls_rsp
clears a field in the ctxp after successfully issuing the wqe.

Remove the unnecessary ctxp write after reposting the rq buffer. The
ctxp->rqb_buffer field is not checked in LS handling after the wqe
is submitted.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reported-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix PT2PT PLOGI collison stopping discovery
James Smart [Wed, 22 May 2019 00:48:58 +0000 (17:48 -0700)]
scsi: lpfc: Fix PT2PT PLOGI collison stopping discovery

Under heavy load the target stops responding, the driver's aborts
time out, and we start recovery by logging out of the target, but
the target is never logged into again.

In a point-to-point scenario, there were battling PLOGI's. When we
received a PLOGI request after having sent one, the driver cancels
the processing of the original plogi. However, the completion path
of the remaining plogi was coded to skip the reg_rpi that should
be happening on the 2nd plogi.

Correct by adding a simple pt2pt check such that the 2nd plogi isn't
skipped and the reg_login occurs.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Revert message logging on unsupported topology
James Smart [Wed, 22 May 2019 00:48:57 +0000 (17:48 -0700)]
scsi: lpfc: Revert message logging on unsupported topology

Turns out the message change in 12.2.0.1 for unsupported topology
makes the linux driver out of sync with other products.

Revert the message back to the prior content for product consistency.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Fix nvmet handling of received ABTS for unmapped frames
James Smart [Wed, 22 May 2019 00:48:56 +0000 (17:48 -0700)]
scsi: lpfc: Fix nvmet handling of received ABTS for unmapped frames

The driver currently is relying on firmware to match ABTSs to existing
exchanges. This works fine as long as an exchange has been assigned to the
io and work posted to it. However, for unmapped frames (rxid=0xFFFF), the
driver has yet to assign an xri. The driver was blindly saying it couldn't
match the ABTS and sending the BA_xxx. However, the command frame may have
been in queues waiting on xri's before posting to the nvmet_fc layer.  When
xri's became available, the command frame would still be pushed to the
transport and that io would execute, even though the io had been killed by
ABTS. The initiator, seeing the io ABTS'd, would reuse the exchange for a
different io which would be received on the target and pushed up. If the
"zombie" io then came back down and started transmitting, the initiator
would match the oxid and accept erroneous data. Bad things happened.

Add tracking of active exchanges in the target to allow matching of a
received ABTS against active or pending IO requests. If the ABTS is matched
to a pending or active IO, the driver initiates cleanup and conditionally
notifies the transport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
5 years agoscsi: lpfc: Separate CQ processing for nvmet_fc upcalls
James Smart [Wed, 22 May 2019 00:48:55 +0000 (17:48 -0700)]
scsi: lpfc: Separate CQ processing for nvmet_fc upcalls

Currently the driver is notified of new command frame receipt by CQEs. As
part of the CQE processing, the driver upcalls the nvmet_fc transport to
deliver the command. nvmet_fc, as part of receiving the command builds out
a context for it, where one of the first steps is to allocate memory for
the io.

When running with tests that do large ios (1MB), it was found on some
systems, the total number of outstanding I/O's, at 1MB per, completely
consumed the system's memory. Thus additional ios were getting blocked in
the memory allocator.  Given that this blocked the lpfc thread processing
CQEs, there were lots of other commands that were received and which are
then held up, and given CQEs are serially processed, the aggregate delays
for an IO waiting behind the others became cumulative - enough so that the
initiator hit timeouts for the ios.

The basic fix is to avoid the direct upcall and instead schedule a work
item for each io as it is received. This allows the cq processing to
complete very quickly, and each io can then run or block on its own.
However, this general solution hurts latency when there are few ios.  As
such, the fix is implemented such that the driver watches how many CQEs it has
processed sequentially in one run. As long as the count is below a
threshold, the direct nvmet_fc upcall will be made. Only when the count is
exceeded will it revert to work scheduling.
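
The threshold decision can be sketched as follows. The limit value, the
struct, and the function are hypothetical and stand in for the driver's
real bookkeeping; the counters only record which path each CQE would take:

```c
#include <stdbool.h>

#define DIRECT_UPCALL_LIMIT 4	/* hypothetical threshold */

struct cq_run {
	unsigned int processed;	/* CQEs handled so far in this run */
	unsigned int direct;	/* delivered by direct upcall */
	unsigned int deferred;	/* handed to a scheduled work item */
};

/* Decide how to deliver one received command: a direct upcall while the
 * run is short, a scheduled work item once the count reaches the limit. */
static bool handle_cqe(struct cq_run *run)
{
	bool direct = run->processed < DIRECT_UPCALL_LIMIT;

	run->processed++;
	if (direct)
		run->direct++;
	else
		run->deferred++;
	return direct;
}
```

A light load never reaches the limit within one run, so every command
takes the low-latency direct path; only a long run of CQEs falls back to
work items, which keeps the CQ processing loop itself from blocking.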

Given that debug of this showed a surprisingly long delay in cq processing,
the io timer stats were updated to better reflect the processing of the
different points.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>