git.osdn.net Git - uclinux-h8/linux.git/log

scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()

It used to be that "error" was set to -ENODEV at the start of the function
but we shifted some code around an now "error" is set to zero for most
error paths. There is a mix of direct returns and "goto out" but I changed
everything to direct returns for consistency.

Fixes: 56de8357049c ("scsi: lpfc: fix calls to dma_set_mask_and_coherent()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libiscsi: Hold back_lock when calling iscsi_complete_task

If there is an error queueing an iscsi command in iscsi_queuecommand(), for
example if the transport fails to take the command in
sessuin->tt->xmit_task(), then the error path can call
iscsi_complete_task() without first aquiring the back_lock as
required. This can lead to things like ITT pool can get corrupt, resulting
in duplicate ITTs being sent out.

The solution is to hold the back_lock around iscsi_complete_task() calls,
and to add a little commenting to help others understand when back_lock
must be held.

Signed-off-by: Lee Duncan <lduncan@suse.com>
Acked-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink

With default value of register SERDES_CFG, the link is not stable for some
special disks when running IO. According to HW guys' suggestion, need to
make the bit10~19 value of register SERDES_CFG the max value to increase
the reliability of the HiLink.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Reviewed-by: Yupeng Zhou <zhouyupeng1@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port

If we exchange SAS expander from one SAS controller to other SAS controller
without powering it down, the STP target port will maintain previous
affiliation and reject all subsequent connection requests from other STP
initiator ports with OPEN_REJECT (STP RESOURCES BUSY).

To solve this issue, send HARD RESET to clear the previous affiliation of
STP target port according to SPL (chapter 6.19.4).

We (re-)introduce dev status flag to know if to sleep in NEXUS reset code
or not for remote PHYs. The idea is that if the device is being
initialised, we don't require the delay, and caller would wait for link to
be established, cf. sas_ata_hard_reset().

Co-developed-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Set PHY linkrate when disconnected

When the PHY comes down, we currently do not set the negotiated linkrate:

root@(none)$ pwd
/sys/class/sas_phy/phy-0:0
root@(none)$ more enable
1
root@(none)$ more negotiated_linkrate
12.0 Gbit
root@(none)$ echo 0 > enable
root@(none)$ more negotiated_linkrate
12.0 Gbit
root@(none)$

This patch fixes the driver code to set it properly when the PHY comes
down.

If the PHY had been enabled, then set unknown; otherwise, flag as disabled.

The logical place to set the negotiated linkrate for this scenario is PHY
down routine, which is called from the PHY down ISR.

However, it is not possible to know if the PHY comes down due to PHY
disable or loss of link, as sas_phy.enabled member is not set until after
the transport disable routine is complete, which races with the PHY down
ISR.

As an imperfect solution, use sas_phy_data.enable as the flag to know if
the PHY is down due to disable. It's imperfect, as sas_phy_data is internal
to libsas.

I can't see another way without adding a new field to hisi_sas_phy and
managing it, or changing SCSI SAS transport.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw

The later revision of v3 hw has added an function of interrupt coalesce
according to time for PHY RX errors. We set the coalesce time to 1s. Then
we print PHY RX errors count when PHY RX errors happen, and don't need to
worry that there may be too much log prints.

Besides, we use hisi_sas_phy.lock to protect error count value. Because we
update them by calling phy_get_events_v3_hw(), which is also used by core
driver (for get PHY events function).

We relocate phy_get_events_v3_hw() to avoid a further declaration.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO

For internal IO and SMP IO, there is a time-out timer for them. In the
timer handler, it checks whether IO is done according to the flag
task->task_state_lock.

There is an issue which may cause system suspended: internal IO or SMP IO
is sent, but at that time because of hardware exception (such as inject
2Bit ECC error), so IO is not completed and also not timeout. But, at that
time, the SAS controller reset occurs to recover system. It will release
the resource and set the status of IO to be SAS_TASK_STATE_DONE, so when IO
timeout, it will never complete the completion of IO and wait for ever.

[  729.123632] Call trace:
[  729.126791] [<ffff00000808655c>] __switch_to+0x94/0xa8
[  729.133106] [<ffff000008d96e98>] __schedule+0x1e8/0x7fc
[  729.138975] [<ffff000008d974e0>] schedule+0x34/0x8c
[  729.144401] [<ffff000008d9b000>] schedule_timeout+0x1d8/0x3cc
[  729.150690] [<ffff000008d98218>] wait_for_common+0xdc/0x1a0
[  729.157101] [<ffff000008d98304>] wait_for_completion+0x28/0x34
[  729.165973] [<ffff000000dcefb4>] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main]
[  729.176447] [<ffff000000dd18f4>] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main]
[  729.185258] [<ffff000008971714>] sas_eh_handle_sas_errors+0x1c8/0x7b8
[  729.192391] [<ffff000008972774>] sas_scsi_recover_host+0x130/0x398
[  729.199237] [<ffff00000894d8a8>] scsi_error_handler+0x148/0x5c0
[  729.206009] [<ffff0000080f4118>] kthread+0x10c/0x138
[  729.211563] [<ffff0000080855dc>] ret_from_fork+0x10/0x18

To solve the issue, callback function task_done of those IOs need to be
called when on SAS controller reset.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Change return variable type in phy_up_v3_hw()

According to the tool fortify, phy_up_v3_hw() returns signed value, while
it should return an unsigned value.

So change variable "res" from int to irq_return_t.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: check for kstrtol() failure

The error handling was unintentionally left out so it introduces a Smatch
static checker warning:

drivers/scsi/qla2xxx/qla_attr.c:1655 qla2x00_port_speed_store()
error: uninitialized symbol 'type'.

Fixes: a7b9ca7fc87a ("scsi: qla2xxx: Add support for setting port speed")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: fix 32-bit format string warning

On 32-bit architectures, we see a warning when %ld is used to print a
size_t:

In file included from drivers/scsi/lpfc/lpfc_init.c:62:
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_new_io_buf':
drivers/scsi/lpfc/lpfc_logmsg.h:62:45: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'unsigned int' [-Werror=format=]

This is harmless, but portable code should just use %zd to avoid the
warning.

Fixes: 0794d601d174 ("scsi: lpfc: Implement common IO buffers between NVME and SCSI")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: fix unused variable warning

The newly introduced 'cpu' variable is only used inside of an optional
block, so we get a warning without CONFIG_SCSI_LPFC_DEBUG_FS:

drivers/scsi/lpfc/lpfc_nvme.c: In function 'lpfc_nvme_io_cmd_wqe_cmpl':
drivers/scsi/lpfc/lpfc_nvme.c:968:30: error: unused variable 'cpu' [-Werror=unused-variable]
uint32_t code, status, idx, cpu;

Move the declaration into the same block to avoid the warning.

Fixes: 63df6d637e33 ("scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: Switch to bitmap_zalloc()

Switch to bitmap_zalloc() to show clearly what we are allocating. Besides
that it returns pointer of bitmap type instead of opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libiscsi: fall back to sendmsg for slab pages

In "XFS over network block device" scenario XFS can create IO requests with
slab-based XFS metadata. During processing such requests tcp_sendpage() can
merge skb fragments with neighbour slab objects.

If receiving side is located on the same host tcp_recvmsg() can trigger
BUG_ON in hardening check and crash the host with following message:

usercopy: kernel memory exposure attempt detected
from XXXXXXXX (kmalloc-512) (1024 bytes)

This patch redirect such requests from sednpage to sendmsg path. The
problem is similar to one described in recent commit 7e241f647dc7
("libceph: fall back to sendmsg for slab pages")

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: avoid printf format warning

Depending on the target architecture and configuration, both phys_addr_t
and dma_addr_t may be smaller than 'long long', so we get a warning when
printing either of them using the %llx format string:

drivers/scsi/qla2xxx/qla_iocb.c: In function 'qla24xx_walk_and_build_prot_sglist':
drivers/scsi/qla2xxx/qla_iocb.c:1140:46: error: format '%llx' expects argument of type 'long long unsigned int', but argument 6 has type 'dma_addr_t' {aka 'unsigned int'} [-Werror=format=]
         "%s: page boundary crossing (phys=%llx len=%x)\n",
                                           ~~~^
                                           %x
         __func__, sle_phys, sg->length);
                   ~~~~~~~~
drivers/scsi/qla2xxx/qla_iocb.c:1180:29: error: format '%llx' expects argument of type 'long long unsigned int', but argument 7 has type 'dma_addr_t' {aka 'unsigned int'} [-Werror=format=]
        "%s: sg[%x] (phys=%llx sglen=%x) ldma_sg_len: %x dif_bundl_len: %x ldma_needed: %x\n",
                          ~~~^

There are special %pad and %pap format strings in Linux that we could use
here, but since the driver already does 64-bit arithmetic on the values,
using a plain 'u64' seems more consistent here.

Note: A possible related issue may be that the driver possibly checks the
wrong kind of overflow: when an IOMMU is in use, buffers that cross a
32-bit boundary in physical addresses would still be mapped into dma
addresses within the low 4GB space, so I suspect that we actually want to
check sg_dma_address() instead of sg_phys() here.

Fixes: 50b812755e97 ("scsi: qla2xxx: Fix DMA error when the DIF sg buffer crosses 4GB boundary")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset

The patch that replaced io channels for hdw_queues now reports the
following static checker warning:

drivers/scsi/lpfc/lpfc_init.c:11136 lpfc_sli4_hba_unset()
error: we previously assumed 'phba->pport' could be null (see line 11074)

Resolve by adding a pport NULL check.

[mkp: tag tweak]

Fixes: cdb42becdd40 ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu"_
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check

The outer routine lpfc_sli_issue_iocb(), which decomposes into the
SLI3 (s3) or SLI4 (s4) subroutines takes out the locks. For s3, it takes
out the hbalock. For s4, it takes out the ring_lock. The lockdep check in
the s3 and s4 subroutines both check hbalock, which is incorrect for s4.

Revise the s4 subroutine to lockdep check the ring_lock.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: hisi: fix ufs_hba_variant_ops passing

Without CONFIG_OF, the of_match_node() helper does not evaluate its
argument, and the compiler warns about the unused variable:

drivers/scsi/ufs/ufs-hisi.c: In function 'ufs_hisi_probe':
drivers/scsi/ufs/ufs-hisi.c:673:17: error: unused variable 'dev' [-Werror=unused-variable]

Rework this code to pass the data directly, and while we're at it,
correctly handle the const pointers.

Fixes: 653fcb07d95e ("scsi: ufs: Add HI3670 SoC UFS driver support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show

When trying to display tgt_counters in the debugfs, a panic can result.

There is no null check for qpair after it is assigned in the for-loop.
Unless vha->hw->queue_pair_map array is completely filled with entries, the
system will panic dereferencing a null pointer.

Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: reduce module load time

megaraid_sas takes 1+ seconds to load while waiting for firmware:

[2.822603] megaraid_sas 0000:03:00.0: Waiting for FW to come to ready state
[3.871003] megaraid_sas 0000:03:00.0: FW now in Ready state

This is due to the following loop in megasas_transition_to_ready(), which
waits a minimum of 1 second, even though the FW becomes ready in tens of
millisecs:

        /*
         * The cur_state should not last for more than max_wait secs
         */
        for (i = 0; i < max_wait; i++) {
                ...
                msleep(1000);
        ...
        dev_info(&instance->pdev->dev, "FW now in Ready state\n");

This is a regression, caused by a change of the msleep granularity from 1
to 1000 due to concern about waiting too long on systems with coarse
jiffies.

To fix, increase iterations and use msleep(20), which results in:

[2.670627] megaraid_sas 0000:03:00.0: Waiting for FW to come to ready state
[2.739386] megaraid_sas 0000:03:00.0: FW now in Ready state

Fixes: fb2f3e96d80f ("scsi: megaraid_sas: Fix msleep granularity")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: tcmu: wait for nl reply only if there are listeners or during an add

genlmsg_multicast_allns now returns the correct statuses when a message is
sent to a listener. However in the case of adding a device we want to wait
for the listener otherwise we may miss the the device during startup.

Signed-off-by: Cathy Avery <cavery@redhat.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: virtio_scsi: don't send sc payload with tmfs

The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of
device-readable records and a single device-writable response entry:

    struct virtio_scsi_ctrl_tmf
    {
        // Device-readable part
        le32 type;
        le32 subtype;
        u8 lun[8];
        le64 id;
        // Device-writable part
        u8 response;
    }

The above should be organised as two descriptor entries (or potentially
more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64
id" or after "u8 response".

The Linux driver doesn't respect that, with virtscsi_abort() and
virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf().  It
results in the original scsi command payload (or writable buffers) added to
the tmf.

This fixes the problem by leaving cmd->sc zeroed out, which makes
virtscsi_kick_cmd() add the tmf to the control vq without any payload.

Cc: stable@vger.kernel.org
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: Reporting 'logical unit failure'

When the HARDWARE_ERROR/0x3e/0x1 case is triggered, the logical volume
is offlined. When reading the kernel log, the reason why the device
got offlined isn't reported to the user. This situation makes it
difficult for admins to root cause.

Log a message when this condition occurs.

[mkp: tweaked commit message]

Signed-off-by: Erwan Velu <e.velu@criteo.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: cxgb4i: validate tcp sequence number only if chip version <= T5

T6 adapters generates DDP completion message on receiving all iSCSI pdus in
a sequence. Because of this, driver can not keep track of tcp sequence
number for T6 adapters.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: cxgb4i: get pf number from lldi->pf

Instead of using viid to get pf number, directly get pf number from
lldi->pf.

Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c

We had a test-report where, under memory pressure, adding LUNs to the
systems would fail (the tests add LUNs strictly in sequence):

[ 5525.853432] scsi 0:0:1:1088045124: Direct-Access     IBM      2107900          .148 PQ: 0 ANSI: 5
[ 5525.853826] scsi 0:0:1:1088045124: alua: supports implicit TPGS
[ 5525.853830] scsi 0:0:1:1088045124: alua: device naa.6005076303ffd32700000000000044da port group 0 rel port 43
[ 5525.853931] sd 0:0:1:1088045124: Attached scsi generic sg10 type 0
[ 5525.854075] sd 0:0:1:1088045124: [sdk] Disabling DIF Type 1 protection
[ 5525.855495] sd 0:0:1:1088045124: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 5525.855606] sd 0:0:1:1088045124: [sdk] Write Protect is off
[ 5525.855609] sd 0:0:1:1088045124: [sdk] Mode Sense: ed 00 00 08
[ 5525.855795] sd 0:0:1:1088045124: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5525.857838]  sdk: sdk1
[ 5525.859468] sd 0:0:1:1088045124: [sdk] Attached SCSI disk
[ 5525.865073] sd 0:0:1:1088045124: alua: transition timeout set to 60 seconds
[ 5525.865078] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.015070] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.015213] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.587439] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
[ 5526.588562] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured

Looking at the code of scsi_alloc_sdev(), and all the calling contexts,
there seems to be no reason to use GFP_ATMOIC here. All the different
call-contexts use a mutex at some point, and nothing in between that
requires no sleeping, as far as I could see. Additionally, the code that
later allocates the block queue for the device (scsi_mq_alloc_queue())
already uses GFP_KERNEL.

There are similar allocations in two other functions:
scsi_probe_and_add_lun(), and scsi_add_lun(),; that can also be done with
GFP_KERNEL.

Here is the contexts for the three functions so far:

    scsi_alloc_sdev()
        scsi_probe_and_add_lun()
            scsi_sequential_lun_scan()
                __scsi_scan_target()
                    scsi_scan_target()
                        mutex_lock()
                    scsi_scan_channel()
                        scsi_scan_host_selected()
                            mutex_lock()
            scsi_report_lun_scan()
                __scsi_scan_target()
                 ...
            __scsi_add_device()
                mutex_lock()
            __scsi_scan_target()
                ...
        scsi_report_lun_scan()
            ...
        scsi_get_host_dev()
            mutex_lock()

    scsi_probe_and_add_lun()
        ...

    scsi_add_lun()
        scsi_probe_and_add_lun()
            ...

So replace all these, and give them a bit of a better chance to succeed,
with more chances of reclaim.

Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Add missing breaks in switch statements

Fix the following warnings by adding the proper missing breaks:

drivers/scsi/mpt3sas/mpt3sas_base.c: In function  _base_display_OEMs_branding :
drivers/scsi/mpt3sas/mpt3sas_base.c:3548:4: warning: this statement may fall through [-Wimplicit-fallthrough=]
    switch (ioc->pdev->subsystem_device) {
    ^~~~~~
drivers/scsi/mpt3sas/mpt3sas_base.c:3566:3: note: here
   case MPI2_MFGPAGE_DEVID_SAS2308_2:
   ^~~~
drivers/scsi/mpt3sas/mpt3sas_base.c:3567:4: warning: this statement may fall through [-Wimplicit-fallthrough=]
    switch (ioc->pdev->subsystem_device) {
    ^~~~~~
drivers/scsi/mpt3sas/mpt3sas_base.c:3601:3: note: here
   case MPI25_MFGPAGE_DEVID_SAS3008:
   ^~~~
drivers/scsi/mpt3sas/mpt3sas_base.c:3735:4: warning: this statement may fall through [-Wimplicit-fallthrough=]
    switch (ioc->pdev->subsystem_device) {
    ^~~~~~
drivers/scsi/mpt3sas/mpt3sas_base.c:3745:3: note: here
   case MPI2_MFGPAGE_DEVID_SAS2308_2:
   ^~~~
drivers/scsi/mpt3sas/mpt3sas_base.c:3746:4: warning: this statement may fall through [-Wimplicit-fallthrough=]
    switch (ioc->pdev->subsystem_device) {
    ^~~~~~
drivers/scsi/mpt3sas/mpt3sas_base.c:3768:3: note: here
   default:
   ^~~~~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: aacraid: Fix missing break in switch statement

Add missing break statement and fix identation issue.

This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.

Fixes: 9cb62fa24e0d ("aacraid: Log firmware AIF messages")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: kill command serial number

No users left, kill it.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: csiostor: drop serial_number usage

Use request tag instead of the serial number when printing out logging
messages.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mvumi: use request tag instead of serial_number

Use the request tag for logging instead of the scsi command serial number.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: dpt_i2o: remove serial number usage

Drop references to scsi_cmnd->serial_number.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: st: osst: Remove negative constant left-shifts

Negative constant left-shift is undefined behaviour in the C standard, and
as such newer versions of clang (at least) warn against it. GCC supports it
for a long time, but it would be better to remove it and rely on defined
behaviour.

My understanding is "~(-1 << N)" in 2's complement is intended to generate
a bit pattern of zeroes ending with N '1' bits. The same can be achieved by
"(1 << N) - 1" in a well-defined way, so switch to it to remove the
warning.

Tested: building a kernel with generic SCSI tape, and checking basic
operations (mt status, mt eject) on a real LTO unit. Cannot test the osst
driver.

Signed-off-by: Iustin Pop <iustin@k1024.org>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs-bsg: Allow reading descriptors

Add this functionality, placing the descriptor being read in the actual
data buffer in the bio.

That is, for both read and write descriptors query upiu, we are using the
job's request_payload. This in turn, is mapped back in user land to the
applicable sg_io_v4 xferp: dout_xferp for write descriptor, and din_xferp
for read descriptor.

Signed-off-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Allow reading descriptor via raw upiu

Allow to read descriptors via raw upiu. This in fact was forbidden just as
a precaution, as ufs-bsg actually enforces which functionality is
supported.

Signed-off-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs-bsg: Change the calling convention for write descriptor

When we had a write descriptor query upiu, we appended the descriptor right
after the bsg request. This was fine as the bsg driver allows to allocate
whatever buffer we needed in its job request.

Still, the proper way to deliver payload, however small (we only write
config descriptors of 144 bytes), is by using the job request payload data
buffer.

So change this ABI now, while ufs-bsg is still new, and nobody is actually
using it.

Signed-off-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Remove unused device quirks

The UFSHC driver defines a few quirks that are not used anywhere:

UFS_DEVICE_QUIRK_BROKEN_LCC
UFS_DEVICE_NO_VCCQ
UFS_DEVICE_QUIRK_NO_LINK_OFF
UFS_DEVICE_NO_FASTAUTO

Let's remove them.

Acked-by: Avri Altman <avri.altman@wdc.com>
Acked-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

Revert "scsi: ufs: disable vccq if it's not needed by UFS device"

This reverts commit 60f0187031c05e04cbadffb62f557d0ff3564490.

There was one conflict in drivers/scsi/ufs/ufshcd.c

<<<<<<< HEAD
/* Init check for device descriptor sizes */
ufshcd_init_desc_sizes(hba);

ret = ufs_get_device_desc(hba, &card);
if (ret) {
dev_err(hba->dev, "%s: Failed getting device info. err = %d\n",
__func__, ret);
goto out;
}

ufs_fixup_device_setup(hba, &card);
ufshcd_tune_unipro_params(hba);

ret = ufshcd_set_vccq_rail_unused(hba,
(hba->dev_quirks & UFS_DEVICE_NO_VCCQ) ? true : false);
if (ret)
goto out;

=======
ufs_advertise_fixup_device(hba);
>>>>>>> parent of 60f0187031c0... scsi: ufs: disable vccq if it's not needed by UFS device

Resolution: keep HEAD, and delete the ufshcd_set_vccq_rail_unused() call
and corresponding error-handling code.

Clean up loose ends in a follow-up patch.

60f0187031c0 introduced a small power optimization: ignore the vccq load
specified in the UFSHC DT node when said host controller is connected to
specific Flash chips (currently, Samsung and Hynix).

Unfortunately, this optimization breaks UFS on systems where vccq powers
not only the Flash chip, but the host controller as well, such as APQ8098
MEDIABOX or MTP8998:

[    3.929877] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr: opcode 0x04 for idn 13 failed, index 0, err = -11
[    5.433815] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr: opcode 0x04 for idn 13 failed, index 0, err = -11
[    6.937771] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr: opcode 0x04 for idn 13 failed, index 0, err = -11
[    6.937866] ufshcd-qcom 1da4000.ufshc: ufshcd_query_attr_retry: query attribute, idn 13, failed with error -11 after 3 retires
[    6.946412] ufshcd-qcom 1da4000.ufshc: ufshcd_disable_auto_bkops: failed to enable exception event -11
[    6.957972] ufshcd-qcom 1da4000.ufshc: dme-peer-get: attr-id 0x1587 failed 3 retries
[    6.967181] ufshcd-qcom 1da4000.ufshc: dme-peer-get: attr-id 0x1586 failed 3 retries
[    6.975025] ufshcd-qcom 1da4000.ufshc: ufshcd_get_max_pwr_mode: invalid max pwm tx gear read = 0
[    6.982755] ufshcd-qcom 1da4000.ufshc: ufshcd_probe_hba: Failed getting max supported power mode
[    8.505770] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag: Sending flag query for idn 3 failed, err = -11
[   10.009807] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag: Sending flag query for idn 3 failed, err = -11
[   11.513766] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag: Sending flag query for idn 3 failed, err = -11
[   11.513861] ufshcd-qcom 1da4000.ufshc: ufshcd_query_flag_retry: query attribute, opcode 5, idn 3, failed with error -11 after 3 retires
[   13.049807] ufshcd-qcom 1da4000.ufshc: __ufshcd_query_descriptor: opcode 0x01 for idn 8 failed, index 0, err = -11
[   14.553768] ufshcd-qcom 1da4000.ufshc: __ufshcd_query_descriptor: opcode 0x01 for idn 8 failed, index 0, err = -11
[   16.057767] ufshcd-qcom 1da4000.ufshc: __ufshcd_query_descriptor: opcode 0x01 for idn 8 failed, index 0, err = -11
[   16.057872] ufshcd-qcom 1da4000.ufshc: ufshcd_read_desc_param: Failed reading descriptor. desc_id 8, desc_index 0, param_offset 0, ret -11
[   16.067109] ufshcd-qcom 1da4000.ufshc: ufshcd_init_icc_levels: Failed reading power descriptor.len = 98 ret = -11
[   37.073787] ufshcd-qcom 1da4000.ufshc: link startup failed 1

In my opinion, the rationale for the original patch is questionable.  If
neither the UFSHC, nor the Flash chip, require any load from vccq, then
that power rail should simply not be specified at all in the DT.

Working around that fact in the driver is detrimental, as evidenced by the
failure to initialize the host controller on MSM8998.

Acked-by: Avri Altman <avri.altman@wdc.com>
Acked-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: Remove a bunch of set but not used variables

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'wait_and_poll':
drivers/scsi/megaraid/megaraid_sas_fusion.c:936:25: warning:
variable 'fusion' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_sync_map_info':
drivers/scsi/megaraid/megaraid_sas_fusion.c:1329:6: warning:
variable 'size_sync_info' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_init_adapter_fusion':
drivers/scsi/megaraid/megaraid_sas_fusion.c:1639:39: warning:
variable 'reg_set' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_is_prp_possible':
drivers/scsi/megaraid/megaraid_sas_fusion.c:1925:25: warning:
variable 'fusion' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_make_prp_nvme':
drivers/scsi/megaraid/megaraid_sas_fusion.c:2047:25: warning:
variable 'fusion' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_build_ldio_fusion':
drivers/scsi/megaraid/megaraid_sas_fusion.c:2620:42: warning:
variable 'req_desc' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_build_and_issue_cmd_fusion':
drivers/scsi/megaraid/megaraid_sas_fusion.c:3245:25: warning:
variable 'fusion' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_task_abort_fusion':
drivers/scsi/megaraid/megaraid_sas_fusion.c:4398:25: warning:
variable 'fusion' set but not used [-Wunused-but-set-variable]

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_reset_target_fusion':
drivers/scsi/megaraid/megaraid_sas_fusion.c:4484:25: warning:
variable 'fusion' set but not used [-Wunused-but-set-variable]

They're not used anymore and can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: clean obsolete return values of eh_timed_out

Those are no longer in use since commit 242f9dcb8ba6
("block: unify request timeout handling").

Signed-off-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd: Optimal I/O size should be a multiple of physical block size

It was reported that some devices report an OPTIMAL TRANSFER LENGTH of
0xFFFF blocks. That looks bogus, especially for a device with a
4096-byte physical block size.

Ignore OPTIMAL TRANSFER LENGTH if it is not a multiple of the device's
reported physical block size.

To make the sanity checking conditionals more readable--and to
facilitate printing warnings--relocate the checking to a helper
function. No functional change aside from the printks.

Cc: <stable@vger.kernel.org>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199759
Reported-by: Christoph Anton Mitterer <calestyo@scientia.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: MAINTAINERS: SCSI initiator and target tweaks

Nic has been absent for a while and target changes now go through the SCSI
tree. To avoid confusion wrt. the NVMe target, clarify that this entry
refers to the SCSI target subsystem.

Also add patchwork links for both SCSI initiator and target.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: fcoe: make use of fip_mode enum complete

commit 1917d42d14b7 ("fcoe: use enum for fip_mode") introduces a separate
enum for the fip_mode that shall be used during initialisation handling
until it is passed to fcoe_ctrl_link_up to set the initial fip_state. That
change was incomplete and gcc quietly converted in various places between
the fip_mode and the fip_state enum values with implicit enum conversions,
which fortunately cannot cause any issues in the actual code's execution.

clang however warns about these implicit enum conversions in the scsi
drivers. This commit consolidates the use of the two enums, guided by
clang's enum-conversion warnings.

This commit now completes the use of the fip_mode: It expects and uses
fip_mode in {bnx2fc,fcoe}_interface_create and fcoe_ctlr_init, and it calls
fcoe_ctrl_set_set() with the correct values in fcoe_ctlr_link_up(). It
also breaks the association between FIP_MODE_AUTO and FIP_ST_AUTO to
indicate these two enums are distinct.

Link: https://github.com/ClangBuiltLinux/linux/issues/151
Fixes: 1917d42d14b7 ("fcoe: use enum for fip_mode")
Reported-by: Dmitry Golovin <dima@golovin.in>
Original-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
CC: Lukas Bulwahn <lukas.bulwahn@gmail.com>
CC: Nick Desaulniers <ndesaulniers@google.com>
CC: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: return error when create DMA pool failed

when create DMA pool for cmd frames failed, we should return -ENOMEM,
instead of 0.
In some case in:

    megasas_init_adapter_fusion()

    -->megasas_alloc_cmds()
       -->megasas_create_frame_pool
          create DMA pool failed,
        --> megasas_free_cmds() [1]

    -->megasas_alloc_cmds_fusion()
       failed, then goto fail_alloc_cmds.
    -->megasas_free_cmds() [2]

we will call megasas_free_cmds twice, [1] will kfree cmd_list,
[2] will use cmd_list.it will cause a problem:

Unable to handle kernel NULL pointer dereference at virtual address
00000000
pgd = ffffffc000f70000
[00000000] *pgd=0000001fbf893003, *pud=0000001fbf893003,
*pmd=0000001fbf894003, *pte=006000006d000707
Internal error: Oops: 96000005 [#1] SMP
Modules linked in:
CPU: 18 PID: 1 Comm: swapper/0 Not tainted
task: ffffffdfb9290000 ti: ffffffdfb923c000 task.ti: ffffffdfb923c000
PC is at megasas_free_cmds+0x30/0x70
LR is at megasas_free_cmds+0x24/0x70
...
Call trace:
[<ffffffc0005b779c>] megasas_free_cmds+0x30/0x70
[<ffffffc0005bca74>] megasas_init_adapter_fusion+0x2f4/0x4d8
[<ffffffc0005b926c>] megasas_init_fw+0x2dc/0x760
[<ffffffc0005b9ab0>] megasas_probe_one+0x3c0/0xcd8
[<ffffffc0004a5abc>] local_pci_probe+0x4c/0xb4
[<ffffffc0004a5c40>] pci_device_probe+0x11c/0x14c
[<ffffffc00053a5e4>] driver_probe_device+0x1ec/0x430
[<ffffffc00053a92c>] __driver_attach+0xa8/0xb0
[<ffffffc000538178>] bus_for_each_dev+0x74/0xc8
  [<ffffffc000539e88>] driver_attach+0x28/0x34
[<ffffffc000539a18>] bus_add_driver+0x16c/0x248
[<ffffffc00053b234>] driver_register+0x6c/0x138
[<ffffffc0004a5350>] __pci_register_driver+0x5c/0x6c
[<ffffffc000ce3868>] megasas_init+0xc0/0x1a8
[<ffffffc000082a58>] do_one_initcall+0xe8/0x1ec
[<ffffffc000ca7be8>] kernel_init_freeable+0x1c8/0x284
[<ffffffc0008d90b8>] kernel_init+0x1c/0xe4

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Avoid PCI IRQ affinity mapping when multiqueue is not supported

This patch fixes warning seen when BLK-MQ is enabled and hardware does not
support MQ. This will result into driver requesting MSIx vectors which are
equal or less than pre_desc via PCI IRQ Affinity infrastructure.

    [   19.746300] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.12-k.
    [   19.746599] qla2xxx [0000:02:00.0]-001d: : Found an ISP2432 irq 18 iobase 0x(____ptrval____).
    [   20.203186] ------------[ cut here ]------------
    [   20.203306] WARNING: CPU: 8 PID: 268 at drivers/pci/msi.c:1273 pci_irq_get_affinity+0xf4/0x120
    [   20.203481] Modules linked in: tg3 ptp qla2xxx(+) pps_core sg libphy scsi_transport_fc flash loop autofs4
    [   20.203700] CPU: 8 PID: 268 Comm: systemd-udevd Not tainted 5.0.0-rc5-00358-gdf3865f #113
    [   20.203830] Call Trace:
    [   20.203933]  [0000000000461bb0] __warn+0xb0/0xe0
    [   20.204090]  [00000000006c8f34] pci_irq_get_affinity+0xf4/0x120
    [   20.204219]  [000000000068c764] blk_mq_pci_map_queues+0x24/0x120
    [   20.204396]  [00000000007162f4] scsi_map_queues+0x14/0x40
    [   20.204626]  [0000000000673654] blk_mq_update_queue_map+0x94/0xe0
    [   20.204698]  [0000000000676ce0] blk_mq_alloc_tag_set+0x120/0x300
    [   20.204869]  [000000000071077c] scsi_add_host_with_dma+0x7c/0x300
    [   20.205419]  [00000000100ead54] qla2x00_probe_one+0x19d4/0x2640 [qla2xxx]
    [   20.205621]  [00000000006b3c88] pci_device_probe+0xc8/0x160
    [   20.205697]  [0000000000701c0c] really_probe+0x1ac/0x2e0
    [   20.205770]  [0000000000701f90] driver_probe_device+0x50/0x100
    [   20.205843]  [0000000000702134] __driver_attach+0xf4/0x120
    [   20.205913]  [0000000000700644] bus_for_each_dev+0x44/0x80
    [   20.206081]  [0000000000700c98] bus_add_driver+0x198/0x220
    [   20.206300]  [0000000000702950] driver_register+0x70/0x120
    [   20.206582]  [0000000010248224] qla2x00_module_init+0x224/0x284 [qla2xxx]
    [   20.206857] ---[ end trace b1de7a3f79fab2c2 ]---

The fix is to check if the hardware does not have Multi Queue capabiltiy,
use pci_alloc_irq_vectors() call instead of pci_alloc_irq_affinity().

Fixes: f664a3cc17b7d ("scsi: kill off the legacy IO path")
Cc: stable@vger.kernel.org #4.19
Signed-off-by: Giridhar Malavali <gmalavali@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Update driver version to 10.00.00.14-k

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Add new FW dump template entry types

This patch adds new firmware dump template entries for ISP27XX firmware
dump.

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix code indentation for qla27xx_fwdt_entry

This patch fixes following checkpatch ERROR

ERROR: space prohibited before that ',' (ctx:WxW)

No change is functionality due to this patch.

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Move marker request behind QPair

Current code hard codes marker request to use request and response queue
0. This patch make use of the qpair as the path to access the
request/response queues. It allows marker to be place on any hardware
queue.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Prevent SysFS access when chip is down

Prevent user from sending commands through sysfs while FW is not running or
reset is in progress.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Add support for setting port speed

This patch adds sysfs node

1. There is a new sysfs node port_speed
2. The possible values are 2(Auto neg), 8, 16, 32
3. A value outside of the above defaults to Auto neg
4. Any update to the setting causes a link toggle
5. This feature is currently only for ISP27xx

Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Prevent multiple ADISC commands per session

Add check to allow 1 discovery command per session to be sent.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Check for FW started flag before aborting

For FC-NVMe, if the fw_started flag is not set or fcport is deleted, then
do not send Abort command

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix unload when NVMe devices are configured

This patch fixes driver unload issue when FC-NVMe devices are configured.

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Add First Burst support for FC-NVMe devices

Add Support for First Burst for FC-NVMe protocol. This feature requires
First Burst support in the firmware.

Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix LUN discovery if loop id is not assigned yet by firmware

This patch fixes LUN discovery when loop ID is not yet assigned by the
firmware during driver load/sg_reset operations. Driver will now search for
new loop id before retrying login.

Fixes: 48acad099074 ("scsi: qla2xxx: Fix N2N link re-connect")
Cc: stable@vger.kernel.org #4.19
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: remove redundant null check on pointer sess

The null check on pointer sess and the subsequent call is redundant as sess
is null on all the the paths that lead to the out_term2 label. Hence the
null check and the call can be removed. Also remove the redundant setting
of sess to NULL as this is not required now.

Detected by CoverityScan, CID#1420663 ("Logically dead code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Move debug messages before sending srb preventing panic

When sending an srb with qla2x00_start_sp, the sp can complete and be freed
by the time we log the debug message saying we sent it. This can cause a
panic if sp gets reused quickly or when running a kernel that poisons freed
memory.

This was partially fixed by (not every case was addressed):

Commit 9fe278f44b4b ("scsi: qla2xxx: Move log messages before issuing
command to firmware")

Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Remove set but not used variable 'phys_id'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_cpu_affinity_check':
drivers/scsi/lpfc/lpfc_init.c:10599:19: warning:
variable 'phys_id' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit 6a828b0f6192 ("scsi: lpfc:
Support non-uniform allocation of MSIX vectors to hardware queues")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: Add HI3670 SoC UFS driver support

Add HI3670 SoC UFS driver support by extending the common ufs-hisi
driver. One major difference between HI3660 ad HI3670 SoCs interms of UFS
is the PHY. HI3670 has a 10nm variant PHY and hence this parameter is used
to distinguish the configuration.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Wei Li <liwei213@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: dt-bindings: ufs: Add HI3670 UFS controller binding

Add devicetree binding for HI3670 UFS controller. HI3760 SoC is very
similar to HI3660 SoC with almost same IPs. Only major difference in terms
of UFS is the PHY. HI3670 has 10nm PHY. But since the original driver
(HI3660 UFS) cannot make HI3670 UFS functional, a separate compatible
is added for HI3670 without any fallback.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Wei Li <liwei213@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: fix a handful of indentation issues

There are a handful of statements that are indented incorrectly. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qlogicpti: Use of_node_name_eq for node name comparisons

Convert string compares of DT node names to use of_node_name_eq helper
instead. This removes direct access to the node name pointer.

As prom_name is not used for anything else, remove it.

Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd: Fix typo in sd_first_printk()

Commit b2bff6ceb61a9 ("[SCSI] sd: Quiesce mode sense error messages")
added the macro sd_first_printk(). The macro takes "sdsk" as argument
but dereferences "sdkp". This hasn't caused any real issues since all
callers of sd_first_printk() have an sdkp. But fix the typo.

[mkp: Turned this into a real patch and tweaked commit description]

Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: driver version update

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: Update structures for HOST_DEVICE_LIST DCMD

Add padding to make the structure variables in MR_HOST_DEVICE_LIST_ENTRY
64-bit aligned. Also, add reserved fields to MR_HOST_DEVICE_LIST for
future firmware usage.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix error code if kcalloc() fails

This should return -ENOMEM if kcalloc() fails, but it accidentally
returns success instead.

Fixes: 6a828b0f6192 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ufs: fix a typo in comment

poitner -> pointer.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Pedro Sousa <pedrom.sousa@synopsys.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: scsi_debug: Implement support for write protect

Teach scsi_debug to honor SWP in the Control Mode Page and report the
resulting WP state in the Device-Specific Parameter field.

In check_device_access_params() verify that commands that will write
the medium are permitted to do so.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>

scsi: core: Move resid from scsi_data_buffer to scsi_cmnd

This patch does not change any functionality but reduces the size of
struct scsi_cmnd.

Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd: Remove superfluous residual assignments

Since commit 26e85fcd15f6 ("[SCSI] sd: Permit merged discard requests";
kernel v3.10) sd_done() sets the residual not only for failed special
requests but also for special requests that succeeded. Hence remove the
code from functions called by sd_init_command() that sets the residual.
This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: uas: Use scsi_[gs]et_resid() where appropriate

This patch does not change any functionality.

Cc: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Oliver Neukum <oneukum@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: scsi_debug: Use scsi_[gs]et_resid() where appropriate

This patch does not change any functionality.

Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libiscsi: Use scsi_[gs]et_resid() where appropriate

This patch does not change any functionality.

Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: scsi_debug: Fix a recently introduced regression

A recent commit removed an element from opcode_info_arr[] but did not
modify opcode_ind_arr[] nor was SDEB_I_XDWRITEREAD removed. Remove
SDEB_I_XDWRITEREAD and bring the two arrays again in sync. This patch
avoids that the following is reported:

BUG: KASAN: null-ptr-deref in scsi_debug_queuecommand+0x60f/0xc90 [scsi_debug]
Read of size 1 at addr 0000000000000001 by task iscsi-test-cu/683
CPU: 3 PID: 683 Comm: iscsi-test-cu Not tainted 5.0.0-rc5-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x86/0xca
kasan_report.cold.3+0x5/0x3e
__asan_load1+0x47/0x50
scsi_debug_queuecommand+0x60f/0xc90 [scsi_debug]
scsi_queue_rq+0xc17/0x12e0
blk_mq_dispatch_rq_list+0x5fc/0xb10
blk_mq_sched_dispatch_requests+0x2f7/0x300
__blk_mq_run_hw_queue+0xd6/0x180
__blk_mq_delay_run_hw_queue+0x25c/0x290
blk_mq_run_hw_queue+0x119/0x1b0
blk_mq_sched_insert_request+0x274/0x350
blk_execute_rq_nowait+0x78/0x90
blk_execute_rq+0xcc/0x140
sg_io+0x30f/0x700
scsi_cmd_ioctl+0x4d4/0x540
scsi_cmd_blk_ioctl+0x7b/0x8b
sd_ioctl+0xba/0x150
blkdev_ioctl+0x6e1/0xea0
block_ioctl+0x79/0x90
do_vfs_ioctl+0x12b/0x9b0
ksys_ioctl+0x41/0x80
__x64_sys_ioctl+0x43/0x50
do_syscall_64+0x71/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cc: Christoph Hellwig <hch@lst.de>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Fixes: ae3d56d81507 ("scsi: remove bidirectional command support")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Do some more tidy-up

Do some very minor tidy-up, for things like needlessly initing variable and
not leaving whitespace before quote endings.

Originally-from: Xiang Chen <chenxiang66@hisilicon.com>
Originally-from: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Use pci_irq_get_affinity() for v3 hw as experimental

For auto-control irq affinity mode, choose the dq to deliver IO according
to the current CPU.

Then it decreases the performance regression that fio and CQ interrupts are
processed on different node.

For user control irq affinity mode, keep it as before.

To realize it, also need to distinguish the usage of dq lock and sas_dev
lock.

We mark as experimental due to ongoing discussion on managed MSI IRQ
during hotplug:
https://marc.info/?l=linux-scsi&m=154876335707751&w=2

We're almost at the point where we can expose multiple queues to the upper
layer for SCSI MQ, but we need to sort out the per-HBA tags performance
issue.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Issue internal abort on all relevant queues

To support queue mapped to a CPU, it needs to be ensured that issuing an
internal abort is safe, in that it is guaranteed that an internal abort is
processed for a single IO or a device after all the relevant command(s)
which it is attempting to abort have been processed by the controller.

Currently we only deliver commands for any device on a single queue to
solve this problem, as we know that commands issued on the same queue will
be processed in order, and we will not have a scenario where the internal
abort is racing against a command(s) which it is trying to abort.

To enqueue commands on queue mapped to a CPU, choosing a queue for an
command is based on the associated queue for the current CPU, so this is
not safe for internal abort since it would definitely not be guaranteed
that commands for the command devices are issued on the same queue.

To solve this issue, we take a bludgeoning approach, and issue a separate
internal abort on any queue(s) relevant to the command or device, in that
we will be guaranteed that at least one of these internal aborts will be
received last in the controller.

So, for aborting a single command, we can just force the internal abort to
be issued on the same queue as the command which we are trying to abort.

For aborting all commands associated with a device, we issue a separate
internal abort on all relevant queues. Issuing multiple internal aborts in
this fashion would have not side affect.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: change queue depth from 512 to 4096

If sending IOs to many disks from single queue, it is possible that the
queue may be full. To avoid the situation, change queue depth from 512 to
4096 which is the max number of IOs for v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Add manual trigger for debugfs dump

Add an interface to manually trigger a debugfs dump.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: Add support for DIX feature for v3 hw

This patch adds support for DIX to v3 hw driver.

For this, we build upon support for DIF, most significantly is adding new
DMA map and unmap paths.

Some pre-existing macro precedence issues are also tidied. They were
detected by checkpatch --strict.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: dt-bindings: ufs: Fix the compatible string definition

If you look at the bindings for the UFS Host Controller it says:

- compatible: must contain "jedec,ufs-1.1" or "jedec,ufs-2.0", may
              also list one or more of the following:
                 "qcom,msm8994-ufshc"
                 "qcom,msm8996-ufshc"
                 "qcom,ufshc"

My reading of that is that it's fine to just have either of these:
1. "qcom,msm8996-ufshc", "jedec,ufs-2.0"
2. "qcom,ufshc", "jedec,ufs-2.0"

As far as I can tell neither of the above is actually a good idea.

For #1 it turns out that the driver currently only keys off the
compatible string "qcom,ufshc" so it won't actually probe.

For #2 the driver won't probe but it's not a good idea to keep the SoC
name out of the compatible string.

Let's update the compatible string to make it really explicit.  We'll
include a nod to the existing driver and the old binding and say that
we should always include the "qcom,ufshc" string in addition to the
SoC compatible string.

While we're at it we'll also include another example SoC known to have
UFS: sdm845.

Fixes: 47555a5c8a11 ("scsi: ufs: make the UFS variant a platform device")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ata: Use unsigned int for cmd's type in ioctls in scsi_host_template

Clang warns several times in the scsi subsystem (trimmed for brevity):

drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to
switch condition type (2147762695 to 18446744071562347015) [-Wswitch]
        case CCISS_GETBUSTYPES:
             ^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to
switch condition type (2147762694 to 18446744071562347014) [-Wswitch]
        case CCISS_GETHEARTBEAT:
             ^

The root cause is that the _IOC macro can generate really large numbers,
which don't fit into type 'int', which is used for the cmd parameter in
the ioctls in scsi_host_template. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful. However, looking at
the rest of the kernel tree, all ioctls use an 'unsigned int' for the
cmd parameter, which will fit all of the _IOC values in the scsi/ata
subsystems.

Make that change because none of the ioctls expect a negative value for
any command, it brings the ioctls inline with the reset of the kernel,
and it removes ambiguity, which is never good when dealing with compilers.

Link: https://github.com/ClangBuiltLinux/linux/issues/85
Link: https://github.com/ClangBuiltLinux/linux/issues/154
Link: https://github.com/ClangBuiltLinux/linux/issues/157
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Bradley Grove <bgrove@attotech.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Update lpfc version to 12.2.0.0

Update lpfc version to 12.2.0.0

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Update 12.2.0.0 file copyrights to 2019

For files modified as part of 12.2.0.0 patches, update copyright to 2019

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix nvmet issues when link bounce under IO load

Various null pointer dereference and general protection fault panics occur
when there is a link bounce under load. There are a large number of "error"
message 6413 indicating "bad release".

The issues resolve to list corruptions due to missing or inconsistent lock
protection. Lockups are due to nested locks in the unsolicited abort
path. The unsolicited abort path calls the wrong abort processing
routine. There was also duplicate context release while aborts were still
active in the hardware.

Removed duplicate locks and added lock protection around list item
removal. Commonized lock handling around the abort processing routines.
Prevent context release while still in ABTS list.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall

When the transport calls into the lpfc target to release an IO job
structure, which corresponds to an exchange, and if the driver was waiting
for an exchange in order to post a previously received command to the
transport, the driver immediately takes the IO job and reuses the context
for the prior command and calls nvmet_fc_rcv_fcp_req() to tell the
transport about a newly received command.

Problem is, the execution of the IO job release may be in the context of
the back end driver and its bio completion handlers, thus it may be in a
irq context and protection code kicks in in the bio and request layers that
are subsequently called.

Rework lpfc so that instead of immediately upcalling, queue it to a
deferred work thread and have the thread make the upcall.

Took advantage of this change to remove duplicated code with the normal
command receive path that preps the IO job and upcalls nvmet_fc. Created a
common routine both paths use.

Also corrected some errors that were found during review of the context
freeing and reuse - basically unlocked operations and a somewhat disjoint
set of calls to release associated job elements. Cleaned up this path and
added locks for coherency.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix default driver parameter collision for allowing NPIV support

The conversion to enable SCSI and NVME fc4 support ran into an issue with
NPIV support. With NVME, NPIV is not currently supported, but with SCSI it
was. The driver reverted to its lowest setting meaning NPIV with SCSI was
not allowed.

Convert the NPIV checks and implementation so that SCSI can continue to
allow NPIV support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Rework locking on SCSI io completion

A scsi host lock is taken on every io completion to check whether the abort
handler is waiting on the io completion. This is an expensive lock to take
on all completion when rarely in an abort condition.

Replace scsi host lock with command-specific lock. Synchronize completion
and abort paths by new cmd lock. Ensure all flag changing and nulling of
context pointers taken under lock. When adding lock to task management
abort, realized it was missing other synchronization locks. Added that
synchronization to match normal paths.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Enable SCSI and NVME fc4s by default

Now that performance mods don't split resources by protocol and enable both
protocols by default, there's no reason not to enable concurrent SCSI and
NVME fc4 support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Resize cpu maps structures based on possible cpus

The work done to date utilized the number of present cpus when sizing
per-cpu structures. Structures should have been sized based on the max
possible cpu count.

Convert the driver over to possible cpu count for sizing allocation.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Utilize new IRQ API when allocating MSI-X vectors

Current driver uses the older IRQ API for MSIX allocation

Change driver to utilize pci_alloc_irq_vectors when allocating IRQ vectors.

Make lpfc_cpu_affinity_check use pci_irq_get_affinity to determine how the
kernel mapped all the IRQs.

Remove msix_entries from SLI4 structure, replaced with pci_irq_vector()
usage.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescing

When driving high iop counts, auto_imax coalescing kicks in and drives the
performance to extremely small iops levels.

There are two issues:

1) auto_imax is enabled by default. The auto algorithm, when iops gets
    high, divides the iops by the hdwq count and uses that value to
    calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether
    they have load or not. The EQ_delay is only manipulated every 5s (a
    long time). Thus there were large 5s swings of no interrupt delay
    followed by large/maximum delay, before repeating.

2) When processing a CQ, the driver got mixed up on the rate of when
    to ring the doorbell to keep the chip appraised of the eqe or cqe
    consumption as well as how how long to sit in the thread and
    process queue entries. Currently, the driver capped its work at
    64 entries (very small) and exited/rearmed the CQ.  Thus, on heavy
    loads, additional overheads were taken to exit and re-enter the
    interrupt handler. Worse, if in the large/maximum coalescing
    windows,k it could be a while before getting back to servicing.

The issues are corrected by the following:

- A change in defaults. Auto_imax is turned OFF and fcp_imax is set
   to 0. Thus all interrupts are immediate.

- Cleanup of field names and their meanings. Existing names were
   non-intuitive or used for duplicate things.

- Added max_proc_limit field, to control the length of time the
   handlers would service completions.

- Reworked EQ handling:
    Added common routine that walks eq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after eqe
      processing.
    After rework, xx_release routines are now DB write functions. Renamed
      the routines as such.
    Moved lpfc_sli4_eq_flush(), which does similar action, to same area.
    Replaced the 2 individual loops that walk an eq with a call to the
      common routine.
    Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax.
    Added per-cpu counters to detect interrupt rates and scale
      interrupt coalescing values.

- Reworked CQ handling:
    Added common routine that walks cq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after cqe
      processing.
    After rework, xx_release routines are now DB write functions.  Renamed
      the routines as such.
    Replaced the 3 individual loops that walk a cq with a call to the
      common routine.
    Redefined lpfc_sli4_sp_handle_mcqe() to commong handler definition with
      queue reference. Add increment for mbox completion to handler.

- Added a new module/sysfs attribute: lpfc_cq_max_proc_limit To allow
   dynamic changing of the CQ max_proc_limit value being used.

Although this leaves an EQ as an immediate interrupt, that interrupt will
only occur if a CQ bound to it is in an armed state and has cqe's to
process.  By staying in the cq processing routine longer, high loads will
avoid generating more interrupts as they will only rearm as the processing
thread exits. The immediately interrupt is also beneficial to idle or
lower-processing CQ's as they get serviced immediately without being
penalized by sharing an EQ with a more loaded CQ.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: cleanup: convert eq_delay to usdelay

Review of the eq coalescing logic showed the code was a bit fragmented.
Sometimes it would save/set via an interrupt max value, while in others it
would do so via a usdelay. There were also two places changing eq delay,
one place that issued mailbox commands, and another that changed via
register writes if supported.

Clean this up by:

- Standardizing the operation of lpfc_modify_hba_eq_delay() routine so
   that it is always told of a us delay to impose. The routine then chooses
   the best way to set that - via register or via mbx.

- Rather than two value types stored in eq->q_mode (usdelay if change via
   register, imax if change via mbox) - q_mode always contains usdelay.
   Before any value change, old vs new value is compared and only if
   different is a change done.

- Revised the dmult calculation. dmult is not set based on overall imax
   divided by hardware queues - instead imax applies to a single cpu and
   the value will be replicated to all cpus.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues

So far MSIX vector allocation assumed it would be 1:1 with hardware
queues. However, there are several reasons why fewer MSIX vectors may be
allocated than hardware queues such as the platform being out of vectors or
adapter limits being less than cpu count.

This patch reworks the MSIX/EQ relationships with the per-cpu hardware
queues so they can function independently. MSIX vectors will be equitably
split been cpu sockets/cores and then the per-cpu hardware queues will be
mapped to the vectors most efficient for them.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix setting affinity hints to correlate with hardware queues

The desired affinity for the hardware queue behavior is for hdwq 0 to be
affinitized with cpu 0, hdwq 1 to cpu 1, and so on. The implementation so
far does not do this if the number of cpus is greater than the number of
hardware queues (e.g. hardware queue allocation was administratively
reduced or hardware queue resources could not scale to the cpu count).

Correct the queue affinitization logic when queue count is less than
cpu count.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Allow override of hardware queue selection policies

Default behavior is to use the information from the upper IO stacks to
select the hardware queue to use for IO submission. Which typically has
good cpu affinity.

However, the driver, when used on some variants of the upstream kernel, has
found queuing information to be suboptimal for FCP or IO completion locked
on particular cpus.

For command submission situations, the lpfc_fcp_io_sched module parameter
can be set to specify a hardware queue selection policy that overrides the
os stack information.

For IO completion situations, rather than queing cq processing based on the
cpu servicing the interrupting event, schedule the cq processing on the cpu
associated with the hardware queue's cq.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Adapt partitioned XRI lists to efficient sharing

The XRI get/put lists were partitioned per hardware queue. However, the
adapter rarely had sufficient resources to give a large number of resources
per queue. As such, it became common for a cpu to encounter a lack of XRI
resource and request the upper io stack to retry after returning a BUSY
condition. This occurred even though other cpus were idle and not using
their resources.

Create as efficient a scheme as possible to move resources to the cpus that
need them. Each cpu maintains a small private pool which it allocates from
for io. There is a watermark that the cpu attempts to keep in the private
pool. The private pool, when empty, pulls from a global pool from the
cpu. When the cpu's global pool is empty it will pull from other cpu's
global pool. As there many cpu global pools (1 per cpu or hardware queue
count) and as each cpu selects what cpu to pull from at different rates and
at different times, it creates a radomizing effect that minimizes the
number of cpu's that will contend with each other when the steal XRI's from
another cpu's global pool.

On io completion, a cpu will push the XRI back on to its private pool. A
watermark level is maintained for the private pool such that when it is
exceeded it will move XRI's to the CPU global pool so that other cpu's may
allocate them.

On NVME, as heartbeat commands are critical to get placed on the wire, a
single expedite pool is maintained. When a heartbeat is to be sent, it will
allocate an XRI from the expedite pool rather than the normal cpu
private/global pools. On any io completion, if a reduction in the expedite
pools is seen, it will be replenished before the XRI is placed on the cpu
private pool.

Statistics are added to aid understanding the XRI levels on each cpu and
their behaviors.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Synchronize hardware queues with SCSI MQ interface

Now that the lower half has much better per-cpu parallelization using the
hardware queues, the SCSI MQ support needs to be tied into it.

The involves the following mods:

- Use the hardware queue info from the midlayer to help select the
hardware queue to utilize. This required change to the get_scsi-buf_xxx
routines.

- Remove lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed.

- Includes fix for SLI-3 that does not have multi queue parallelization.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting.

SLI4 nvme functions are passing the SLI3 ring number when posting wqe to
hardware. This should be indicating the hardware queue to use, not the ring
number.

Replace ring number with the hardware queue that should be used.

Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb
routine that properly adapts.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures

Many io statistics were being sampled and saved using adapter-based data
structures. This was creating a lot of contention and cache thrashing in
the I/O path.

Move the statistics to the hardware queue data structures. Given the
per-queue data structures, use of atomic types is lessened.

Add new sysfs and debugfs stat routines to collate the per hardware queue
values and report at an adapter level.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>