OSDN Git Service

uclinux-h8/linux.git
10 years agoMerge branches 'core', 'cxgb4', 'ip-roce', 'iser', 'misc', 'mlx4', 'nes', 'ocrdma...
Roland Dreier [Thu, 3 Apr 2014 15:30:17 +0000 (08:30 -0700)]
Merge branches 'core', 'cxgb4', 'ip-roce', 'iser', 'misc', 'mlx4', 'nes', 'ocrdma', 'qib', 'sgwrapper', 'srp' and 'usnic' into for-next

10 years agoRDMA/ocrdma: Unregister inet notifier when unloading ocrdma
Selvin Xavier [Tue, 18 Mar 2014 09:24:56 +0000 (14:54 +0530)]
RDMA/ocrdma: Unregister inet notifier when unloading ocrdma

Unregister the inet notifier during ocrdma unload to avoid a panic after
driver unload.

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Fix warnings about pointer <-> integer casts
Roland Dreier [Tue, 18 Mar 2014 06:14:17 +0000 (23:14 -0700)]
RDMA/ocrdma: Fix warnings about pointer <-> integer casts

We should cast pointers to and from unsigned long to turn them into ints.

Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Code clean-up
Devesh Sharma [Tue, 4 Feb 2014 06:27:10 +0000 (11:57 +0530)]
RDMA/ocrdma: Code clean-up

Clean up code.  Also modifying GSI QP to error during ocrdma_close is fixed.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Display FW version
Selvin Xavier [Tue, 4 Feb 2014 06:27:09 +0000 (11:57 +0530)]
RDMA/ocrdma: Display FW version

Adding a sysfs file for getting the FW version.

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Query controller information
Selvin Xavier [Tue, 4 Feb 2014 06:27:07 +0000 (11:57 +0530)]
RDMA/ocrdma: Query controller information

Issue mailbox commands to query ocrdma controller information and phy
information and print them while adding ocrdma device.

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Support non-embedded mailbox commands
Selvin Xavier [Tue, 4 Feb 2014 06:27:06 +0000 (11:57 +0530)]
RDMA/ocrdma: Support non-embedded mailbox commands

Added a routine to issue non-embedded mailbox commands for handling
large mailbox request/response data.

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Handle CQ overrun error
Selvin Xavier [Tue, 4 Feb 2014 06:27:05 +0000 (11:57 +0530)]
RDMA/ocrdma: Handle CQ overrun error

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Display proper value for max_mw
Selvin Xavier [Tue, 4 Feb 2014 06:27:04 +0000 (11:57 +0530)]
RDMA/ocrdma: Display proper value for max_mw

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Use non-zero tag in SRQ posting
Selvin Xavier [Tue, 4 Feb 2014 06:27:03 +0000 (11:57 +0530)]
RDMA/ocrdma: Use non-zero tag in SRQ posting

As part of SRQ receive buffers posting we populate a non-zero tag
which will be returned in SRQ receive completions.

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr()
Selvin Xavier [Tue, 4 Feb 2014 06:27:02 +0000 (11:57 +0530)]
RDMA/ocrdma: Memory leak fix in ocrdma_dereg_mr()

Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Increment abi version count
Devesh Sharma [Tue, 4 Feb 2014 06:27:01 +0000 (11:57 +0530)]
RDMA/ocrdma: Increment abi version count

Increment the ABI version count for driver/library interface.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Update version string
Devesh Sharma [Tue, 4 Feb 2014 06:27:00 +0000 (11:57 +0530)]
RDMA/ocrdma: Update version string

Update the driver vrsion string and node description string

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agobe2net: Add abi version between be2net and ocrdma
Devesh Sharma [Tue, 4 Feb 2014 06:26:59 +0000 (11:56 +0530)]
be2net: Add abi version between be2net and ocrdma

This patch adds abi versioning between be2net and ocrdma driver modules
to catch functional incompatibilities in the two drivers.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: ABI versioning between ocrdma and be2net
Devesh Sharma [Tue, 4 Feb 2014 06:26:58 +0000 (11:56 +0530)]
RDMA/ocrdma: ABI versioning between ocrdma and be2net

While loading RoCE driver be2net driver should check for ABI version
to catch functional incompatibilities.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Allow DPP QP creation
Devesh Sharma [Tue, 4 Feb 2014 06:26:57 +0000 (11:56 +0530)]
RDMA/ocrdma: Allow DPP QP creation

Allow creating DPP QP even if inline-data is not requested.  This is an
optimization to lower latency.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Read ASIC_ID register to select asic_gen
Devesh Sharma [Tue, 4 Feb 2014 06:26:56 +0000 (11:56 +0530)]
RDMA/ocrdma: Read ASIC_ID register to select asic_gen

ocrdma driver selects execution path based on sli_family and asic
generation number.  This introduces code to read the asic gen number
from pci register instead of obtaining it from the Emulex NIC driver.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: SQ and RQ doorbell offset clean up
Devesh Sharma [Tue, 4 Feb 2014 06:26:55 +0000 (11:56 +0530)]
RDMA/ocrdma: SQ and RQ doorbell offset clean up

Introducing new macros to define SQ and RQ doorbell offset.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: EQ full catastrophe avoidance
Devesh Sharma [Tue, 4 Feb 2014 06:26:54 +0000 (11:56 +0530)]
RDMA/ocrdma: EQ full catastrophe avoidance

Stale entries in the CQ being destroyed causes hardware to generate
EQEs indefinitely for a given CQ.  Thus causing uncontrolled execution
of irq_handler.  This patch fixes this using following sementics:

    * irq_handler will ring EQ doorbell atleast once and implement budgeting scheme.
    * cq_destroy will count number of valid entires during destroy and ring
      cq-db so that hardware does not generate uncontrolled EQE.
    * cq_destroy will synchronize with last running irq_handler instance.
    * arm_cq will always defer arming CQ till poll_cq, except for the first arm_cq call.
    * poll_cq will always ring cq-db with arm=SET if arm_cq was called prior to enter poll_cq.
    * poll_cq will always ring cq-db with arm=UNSET if arm_cq was not called prior to enter poll_cq.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Selvin Xavier <selvin.xavier@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Disable DSGL use by default
Steve Wise [Thu, 27 Mar 2014 17:03:47 +0000 (12:03 -0500)]
RDMA/cxgb4: Disable DSGL use by default

Current hardware doesn't correctly support DSGL.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: rx_data() needs to hold the ep mutex
Steve Wise [Fri, 21 Mar 2014 15:10:37 +0000 (20:40 +0530)]
RDMA/cxgb4: rx_data() needs to hold the ep mutex

To avoid racing with other threads doing close/flush/whatever, rx_data()
should hold the endpoint mutex.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Drop RX_DATA packets if the endpoint is gone
Steve Wise [Fri, 21 Mar 2014 15:10:36 +0000 (20:40 +0530)]
RDMA/cxgb4: Drop RX_DATA packets if the endpoint is gone

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Lock around accept/reject downcalls
Steve Wise [Fri, 21 Mar 2014 15:10:35 +0000 (20:40 +0530)]
RDMA/cxgb4: Lock around accept/reject downcalls

There is a race between ULP threads doing an accept/reject, and the
ingress processing thread handling close/abort for the same connection.
The accept/reject path needs to hold the lock to serialize these paths.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
[ Fold in locking fix found by Dan Carpenter <dan.carpenter@oracle.com>.
  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: Don't resolve passive side RoCE L2 address in CMA REQ handler
Moni Shoua [Thu, 27 Mar 2014 08:52:58 +0000 (10:52 +0200)]
IB/core: Don't resolve passive side RoCE L2 address in CMA REQ handler

The code that resolves the passive side source MAC within the rdma_cm
connection request handler was both redundant and buggy, so remove it.

It was redundant since later, when an RC QP is modified to RTR state,
the resolution will take place in the ib_core module.  It was buggy
because this callback also deals with UD SIDR exchange, for which we
incorrectly looked at the REQ member of the CM event and dereferenced
a random value.

Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/core: Remove overload in ib_sg_dma*
Mike Marciniszyn [Fri, 28 Mar 2014 17:26:59 +0000 (13:26 -0400)]
IB/core: Remove overload in ib_sg_dma*

The code is replaced by driver specific changes and avoids the pointer
NULL test for drivers that don't overload these operations.

Suggested-by: <Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Tested-by: Vinod Kumar <vinod.kumar@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/ehca: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads
Mike Marciniszyn [Fri, 28 Mar 2014 17:26:53 +0000 (13:26 -0400)]
IB/ehca: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads

These methods appear to only mimic the sg_dma_address() and
sg_dma_len() behavior.

They can be safely removed.

Suggested-by: Bart Van Assche <bvanassche@acm.org>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/ipath: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads
Mike Marciniszyn [Fri, 28 Mar 2014 19:04:43 +0000 (15:04 -0400)]
IB/ipath: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads

The removal of these methods is compensated for by code changes to
.map_sg to insure that the vanilla sg_dma_address() and sg_dma_len()
will do the same thing as the equivalent former ib_sg_dma_address()
and ib_sg_dma_len() calls into the drivers.

The introduction of this patch required that the struct
ipath_dma_mapping_ops be converted to a C99 initializer.

Suggested-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads
Mike Marciniszyn [Fri, 28 Mar 2014 17:26:42 +0000 (13:26 -0400)]
IB/qib: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads

Remove the overload for .dma_len and .dma_address

The removal of these methods is compensated for by code changes to
.map_sg to insure that the vanilla sg_dma_address() and sg_dma_len()
will do the same thing as the equivalent former ib_sg_dma_address()
and ib_sg_dma_len() calls into the drivers.

Suggested-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Tested-by: Vinod Kumar <vinod.kumar@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Bump driver version to 1.3
Or Gerlitz [Tue, 1 Apr 2014 13:28:42 +0000 (16:28 +0300)]
IB/iser: Bump driver version to 1.3

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Update Mellanox copyright note
Or Gerlitz [Tue, 1 Apr 2014 13:28:41 +0000 (16:28 +0300)]
IB/iser: Update Mellanox copyright note

Update Mellanox copyrights for 2014 on the iser initiator driver.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Print QP information once connection is established
Or Gerlitz [Tue, 1 Apr 2014 13:28:40 +0000 (16:28 +0300)]
IB/iser: Print QP information once connection is established

Add an iser info print with the local/remote QP information carried
out when the connection is established.  While here, fix a little
leftover from the T10 work and set a debug print to be carried in
debug and not info level.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Remove struct iscsi_iser_conn
Ariel Nahum [Tue, 1 Apr 2014 13:28:39 +0000 (16:28 +0300)]
IB/iser: Remove struct iscsi_iser_conn

The iscsi stack has existing mechanisms to link back and forth between
the iscsi connection and the iscsi transport (e.g iser/tcp) connection.

This is done through a dd_data pointer field in struct iscsi_conn
which can be set to point to the transport connection, etc.

The iscsi_iser_conn structure was used to get this linking done in
another way, which is uneeded and adds extra complication to the iser
code, so we just remove it.

Signed-off-by: Ariel Nahum <arieln@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Drain the tx cq once before looping on the rx cq
Roi Dayan [Tue, 1 Apr 2014 13:28:38 +0000 (16:28 +0300)]
IB/iser: Drain the tx cq once before looping on the rx cq

The iser disconnection flow isn't done before all the inflight
recv/send buffers posted to the QP are either flushed or normally
completed to the CQ that serves this connection.  The condition check
is done in iser_handle_comp_error().

Currently, it's possible for the send buffer completion that makes the
posted send buffers counter reach zero to be polled in the drain tx
call, which is after the rx cq is fully drained.  Since this
completion might be not an error one (for example, it might be a
completion of the logout request iSCSI PDU) we will skip
iser_handle_comp_error().  So the connection will never terminate from
the iscsi stack point of view, and we hang.

To resolve this race, do the draining of the tx cq before the loop on
the rx cq.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Fix sector_t format warning
Randy Dunlap [Fri, 28 Mar 2014 21:03:40 +0000 (14:03 -0700)]
IB/iser: Fix sector_t format warning

Fix pr_err (printk) format warning:

    drivers/infiniband/ulp/iser/iser_verbs.c:1181:4: warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'sector_t' [-Wformat]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agomlx4_core: Make buffer larger to avoid overflow warning
Dan Carpenter [Fri, 28 Mar 2014 08:23:07 +0000 (11:23 +0300)]
mlx4_core: Make buffer larger to avoid overflow warning

My static checker complains that the sprintf() here can overflow.

drivers/infiniband/hw/mlx4/main.c:1836 mlx4_ib_alloc_eqs()
error: format string overflow. buf_size: 32 length: 69

This seems like a valid complaint.  The "dev->pdev->bus->name" string
can be 48 characters long.  I just made the buffer 80 characters instead
of 69 and I changed the sprintf() to snprintf().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agomlx4_core: Fix some indenting in mlx4_ib_add()
Dan Carpenter [Fri, 28 Mar 2014 08:21:39 +0000 (11:21 +0300)]
mlx4_core: Fix some indenting in mlx4_ib_add()

The code was indented too far and also kernel style says we should have
curly braces.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mad: Check and handle potential DMA mapping errors
Yan Burman [Tue, 11 Mar 2014 12:41:47 +0000 (14:41 +0200)]
IB/mad: Check and handle potential DMA mapping errors

Running with DMA_API_DEBUG enabled and not checking for DMA mapping
errors triggers a kernel stack trace with "DMA-API: device driver
failed to check map error" message.  Add these checks to the MAD
module, both to be be more robust and also eliminate these
false-positive stack traces.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/ehca: Returns an error on ib_copy_to_udata() failure
Yann Droneaud [Mon, 10 Mar 2014 22:06:25 +0000 (23:06 +0100)]
IB/ehca: Returns an error on ib_copy_to_udata() failure

In case of error when writing to userspace, function ehca_create_cq()
does not set an error code before following its error path.

This patch sets the error code to -EFAULT when ib_copy_to_udata()
fails.

This was caught when using spatch (aka. coccinelle)
to rewrite call to ib_copy_{from,to}_udata().

Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci
Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mthca: Return an error on ib_copy_to_udata() failure
Yann Droneaud [Mon, 10 Mar 2014 22:06:26 +0000 (23:06 +0100)]
IB/mthca: Return an error on ib_copy_to_udata() failure

In case of error when writing to userspace, the function mthca_create_cq()
does not set an error code before following its error path.

This patch sets the error code to -EFAULT when ib_copy_to_udata() fails.

This was caught when using spatch (aka. coccinelle)
to rewrite call to ib_copy_{from,to}_udata().

Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci
Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Update snd_seq when sending MPA messages
Steve Wise [Fri, 21 Mar 2014 15:10:34 +0000 (20:40 +0530)]
RDMA/cxgb4: Update snd_seq when sending MPA messages

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Connect_request_upcall fixes
Steve Wise [Fri, 21 Mar 2014 15:10:33 +0000 (20:40 +0530)]
RDMA/cxgb4: Connect_request_upcall fixes

When processing an MPA Start Request, if the listening endpoint is
DEAD, then abort the connection.

If the IWCM returns an error, then we must abort the connection and
release resources.  Also abort_connection() should not post a CLOSE
event, so clean that up too.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Ignore read reponse type 1 CQEs
Steve Wise [Fri, 21 Mar 2014 15:10:32 +0000 (20:40 +0530)]
RDMA/cxgb4: Ignore read reponse type 1 CQEs

These are generated by HW in some error cases and need to be
silently discarded.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Fix possible memory leak in RX_PKT processing
Steve Wise [Fri, 21 Mar 2014 15:10:31 +0000 (20:40 +0530)]
RDMA/cxgb4: Fix possible memory leak in RX_PKT processing

If cxgb4_ofld_send() returns < 0, then send_fw_pass_open_req() must
free the request skb and the saved skb with the tcp header.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Don't leak skb in c4iw_uld_rx_handler()
Steve Wise [Fri, 21 Mar 2014 15:10:30 +0000 (20:40 +0530)]
RDMA/cxgb4: Don't leak skb in c4iw_uld_rx_handler()

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/srp: Fix a race condition between failing I/O and I/O completion
Bart Van Assche [Fri, 14 Mar 2014 12:54:11 +0000 (13:54 +0100)]
IB/srp: Fix a race condition between failing I/O and I/O completion

Avoid that srp_terminate_io() can access req->scmnd after it has been
cleared by the I/O completion code. Do this by protecting req->scmnd
accesses from srp_terminate_io() via locking

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/srp: Avoid that writing into "add_target" hangs due to a cable pull
Bart Van Assche [Fri, 14 Mar 2014 12:53:40 +0000 (13:53 +0100)]
IB/srp: Avoid that writing into "add_target" hangs due to a cable pull

If a cable is pulled while srp_connect_target() is in progress
that can result in that function never to return. That makes the
process, e.g. srp_daemon, that invoked this function unkillable.
Avoid this by letting srp_connect_target() finish if the event
IB_CM_TIMEWAIT_EXIT is received. This patch fixes a hang with the
following call trace:

 [<ffffffff814eae85>] schedule_timeout+0x215/0x2e0
 [<ffffffff814eab03>] wait_for_common+0x123/0x180
 [<ffffffff814eac1d>] wait_for_completion+0x1d/0x20
 [<ffffffffa03b398c>] srp_connect_target+0x1dc/0x410 [ib_srp]
 [<ffffffffa03b5809>] srp_create_target+0xba9/0xe70 [ib_srp]
 [<ffffffff8133e590>] dev_attr_store+0x20/0x30
 [<ffffffff811eb8f5>] sysfs_write_file+0xe5/0x170
 [<ffffffff811767c8>] vfs_write+0xb8/0x1a0
 [<ffffffff811770c1>] sys_write+0x51/0x90
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/srp: Make writing into the "add_target" sysfs attribute interruptible
Bart Van Assche [Fri, 14 Mar 2014 12:53:10 +0000 (13:53 +0100)]
IB/srp: Make writing into the "add_target" sysfs attribute interruptible

Avoid that stopping srp_daemon takes unusually long due to a cable
pull by making writing into the "add_target" sysfs attribute
interruptible.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/srp: Avoid duplicate connections
Bart Van Assche [Fri, 14 Mar 2014 12:52:45 +0000 (13:52 +0100)]
IB/srp: Avoid duplicate connections

The connection uniqueness check is performed before a new connection
is added to the target list. This patch protects both actions by a
mutex such that simultaneous writes from two different threads into the
"add_target" variable do not result in duplicate connections.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/srp: Add more logging
Bart Van Assche [Fri, 14 Mar 2014 12:52:21 +0000 (13:52 +0100)]
IB/srp: Add more logging

Log sgid and dgid when reporting that a login has been rejected or when
a host has been added. This makes it easy to figure out which initiator
and target ports these messages apply to.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/srp: Check ib_query_gid return value
Sagi Grimberg [Fri, 14 Mar 2014 12:51:58 +0000 (13:51 +0100)]
IB/srp: Check ib_query_gid return value

Detected by Coverity.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoscsi_transport_srp: Fix two kernel-doc warnings
Bart Van Assche [Fri, 14 Mar 2014 12:51:19 +0000 (13:51 +0100)]
scsi_transport_srp: Fix two kernel-doc warnings

This patch fixes the following two kernel-doc warnings:

    Warning(drivers/scsi/scsi_transport_srp.c:819): No description found for parameter 'rport'
    Warning(include/scsi/scsi_transport_srp.h:75): Excess struct/union/enum/typedef member 'deleted' description in 'srp_rport'

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Sebastian Riemer <sebastian.riemer@profitbricks.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Cleanup qib_register_observer()
Dan Carpenter [Thu, 30 Jan 2014 12:12:31 +0000 (15:12 +0300)]
IB/qib: Cleanup qib_register_observer()

Returning directly is easier to read than do-nothing gotos.  Remove the
duplicative check on "olp" and pull the code in one indent level.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Change SDMA progression mode depending on single- or multi-rail
CQ Tang [Thu, 30 Jan 2014 22:36:00 +0000 (17:36 -0500)]
IB/qib: Change SDMA progression mode depending on single- or multi-rail

Improve performance by changing the behavour of the driver when all
SDMA descriptors are in use, and the processes adding new descriptors
are single- or multi-rail.

For single-rail processes, the driver will block the call and finish
posting all SDMA descriptors onto the hardware queue before returning
back to PSM.  Repeated kernel calls are slower than blocking.

For multi-rail processes, the driver will return to PSM as quick as
possible so PSM can feed packets to other rail.  If all hardware
queues are full, PSM will buffer the remaining SDMA descriptors until
notified by interrupt that space is available.

This patch builds a red-black tree to track the number rails opened by
a particular PID. If the number is more than one, it is a multi-rail
PSM process, otherwise, it is a single-rail process.

Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: John A Gregor <john.a.gregor@intel.com>
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Save the correct map length for fast_reg_page_lists
Steve Wise [Wed, 19 Mar 2014 12:14:45 +0000 (17:44 +0530)]
RDMA/cxgb4: Save the correct map length for fast_reg_page_lists

We cannot save the mapped length using the rdma max_page_list_len field
of the ib_fast_reg_page_list struct because the core code uses it.  This
results in an incorrect unmap of the page list in c4iw_free_fastreg_pbl().

I found this with dma mapping debugging enabled in the kernel.  The
fix is to save the length in the c4iw_fr_page_list struct.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Default peer2peer mode to 1
Steve Wise [Wed, 19 Mar 2014 12:14:44 +0000 (17:44 +0530)]
RDMA/cxgb4: Default peer2peer mode to 1

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Mind the sq_sig_all/sq_sig_type QP attributes
Steve Wise [Wed, 19 Mar 2014 12:14:43 +0000 (17:44 +0530)]
RDMA/cxgb4: Mind the sq_sig_all/sq_sig_type QP attributes

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Fix incorrect BUG_ON conditions
Steve Wise [Wed, 19 Mar 2014 12:14:42 +0000 (17:44 +0530)]
RDMA/cxgb4: Fix incorrect BUG_ON conditions

Based on original work from Jay Hernandez <jay@chelsio.com>

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Always release neigh entry
Steve Wise [Wed, 19 Mar 2014 12:14:40 +0000 (17:44 +0530)]
RDMA/cxgb4: Always release neigh entry

Always release the neigh entry in rx_pkt().

Based on original work by Santosh Rastapur <santosh@chelsio.com>.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Allow loopback connections
Steve Wise [Wed, 19 Mar 2014 12:14:39 +0000 (17:44 +0530)]
RDMA/cxgb4: Allow loopback connections

find_route() must treat loopback as a valid egress interface.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Cap CQ size at T4_MAX_IQ_SIZE
Steve Wise [Wed, 19 Mar 2014 12:14:38 +0000 (17:44 +0530)]
RDMA/cxgb4: Cap CQ size at T4_MAX_IQ_SIZE

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()
Dan Carpenter [Sat, 19 Oct 2013 09:14:35 +0000 (12:14 +0300)]
RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()

There is a four byte hole at the end of the "uresp" struct after the
->qid_mask member.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/cxgb4: Fix underflows in c4iw_create_qp()
Dan Carpenter [Sat, 19 Oct 2013 09:14:12 +0000 (12:14 +0300)]
RDMA/cxgb4: Fix underflows in c4iw_create_qp()

These sizes should be unsigned so we don't allow negative values and
have underflow bugs.  These can come from the user so there may be
security implications, but I have not tested this.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Publish T10-PI support to SCSI midlayer
Sagi Grimberg [Wed, 5 Mar 2014 17:43:51 +0000 (19:43 +0200)]
IB/iser: Publish T10-PI support to SCSI midlayer

After allocating a scsi_host we set protection types and guard type
supported.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Implement check_protection
Sagi Grimberg [Wed, 5 Mar 2014 17:43:50 +0000 (19:43 +0200)]
IB/iser: Implement check_protection

Once the iSCSI transaction is completed we must implement
check_protection in order to notify on DIF errors that may have
occured.

The routine boils down to calling ib_check_mr_status to get the
signature status of the transaction.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoSCSI/libiscsi: Add check_protection callback for transports
Sagi Grimberg [Wed, 5 Mar 2014 17:43:49 +0000 (19:43 +0200)]
SCSI/libiscsi: Add check_protection callback for transports

iSCSI needs to be at least aware that a task involves protection
information.  In case it does, after the transaction completed libiscsi
will ask the transport to check the protection status of the
transaction.

Unlike transport errors, DIF errors should not prevent successful
completion of the transaction from the transport point of view, but
should be escelated to scsi mid-layer when constructing the scsi
result and sense data.

check_protection routine will return the ascq corresponding to the DIF
error that occured (or 0 if no error happened).

return ascq:
- 0x1: GUARD_CHECK_FAILED
- 0x2: APPTAG_CHECK_FAILED
- 0x3: REFTAG_CHECK_FAILED

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Support T10-PI operations
Sagi Grimberg [Wed, 5 Mar 2014 17:43:48 +0000 (19:43 +0200)]
IB/iser: Support T10-PI operations

Add logic to initialize protection information entities.  Upon each
iSCSI task, we keep the scsi_cmnd in order to query the scsi
protection operations and reference to protection buffers.

Modify iser_fast_reg_mr to receive indication whether it is
registering the data or protection buffers.

In addition introduce iser_reg_sig_mr which performs fast registration
work-request for a signature enabled memory region
(IB_WR_REG_SIG_MR).  In this routine we set all the protection
relevants for the device to offload protection data-transfer and
verification.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Initialize T10-PI resources
Alex Tabachnik [Wed, 5 Mar 2014 17:43:47 +0000 (19:43 +0200)]
IB/iser: Initialize T10-PI resources

During connection establishment we also initialize T10-PI resources
(QP, PI contexts) in order to support SCSI's protection operations.

Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Introduce pi_enable, pi_guard module parameters
Alex Tabachnik [Wed, 5 Mar 2014 17:43:46 +0000 (19:43 +0200)]
IB/iser: Introduce pi_enable, pi_guard module parameters

Use modparams to activate protection information support.

pi_enable bool: Based on this parameter iSER will know if it should
support T10-PI.  We don't want to do this by default as it requires to
allocate and initialize extra resources.  In case pi_enable=N, iSER
won't publish to SCSI midlayer any DIF capabilities.

pi_guard int: Based on this parameter iSER will publish DIX guard type
support to SCSI midlayer.  0 means CRC is allowed to be passed in DIX
buffers, 1 (or non-zero) means IP-CSUM is allowed to be passed in DIX
buffers.  Note that over the wire, only CRC is allowed.

In the next phase, it is worth considering passing these parameters
from iscsid via nlmsg.  This will allow these parameters to be
connection based rather than global.

Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Generalize fall_to_bounce_buf routine
Sagi Grimberg [Wed, 5 Mar 2014 17:43:45 +0000 (19:43 +0200)]
IB/iser: Generalize fall_to_bounce_buf routine

Unaligned SG-lists may also happen for protection information.
Generalize bounce buffer routine to handle any iser_data_buf which may
be data and/or protection.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Generalize iser_unmap_task_data and finalize_rdma_unaligned_sg
Sagi Grimberg [Wed, 5 Mar 2014 17:43:44 +0000 (19:43 +0200)]
IB/iser: Generalize iser_unmap_task_data and finalize_rdma_unaligned_sg

This routines operates on data buffers and may also work with
protection infomation buffers.  So we generalize them to handle an
iser_data_buf which can be the command data or command protection
information.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Replace fastreg descriptor valid bool with indicators container
Sagi Grimberg [Wed, 5 Mar 2014 17:43:43 +0000 (19:43 +0200)]
IB/iser: Replace fastreg descriptor valid bool with indicators container

In T10-PI support we will have memory keys for protection buffers and
signature transactions.  We prefer to compact indicators rather than
keeping multiple bools.

This commit does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Keep IB device attributes under iser_device
Sagi Grimberg [Wed, 5 Mar 2014 17:43:42 +0000 (19:43 +0200)]
IB/iser: Keep IB device attributes under iser_device

For T10-PI offload support, we will need to know the device signature
offload capability upon every connection establishment.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Move fast_reg_descriptor initialization to a function
Sagi Grimberg [Wed, 5 Mar 2014 17:43:41 +0000 (19:43 +0200)]
IB/iser: Move fast_reg_descriptor initialization to a function

fastreg descriptor will include protection information context.  In
order to place the logic in one place we introduce iser_create_fr_desc
function.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Push the decision what memory key to use into fast_reg_mr routine
Sagi Grimberg [Wed, 5 Mar 2014 17:43:40 +0000 (19:43 +0200)]
IB/iser: Push the decision what memory key to use into fast_reg_mr routine

This is a preparation step for T10-PI offload support.  We prefer to
push the desicion of which mkey to use (global or fastreg) to
iser_fast_reg_mr.  We choose to do this since in T10-PI we may need to
register for protection buffers and in this case we wish to simplify
iser_fast_reg_mr instead of repeating the logic of which key to use.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Avoid FRWR notation, use fastreg instead
Sagi Grimberg [Wed, 5 Mar 2014 17:43:39 +0000 (19:43 +0200)]
IB/iser: Avoid FRWR notation, use fastreg instead

FRWR stands for "fast registration work request". We want to avoid
calling the fastreg pool with that name, instead we name it fastreg
which stands for "fast registration".

This pool will include more elements in the future, so it is a good
idea to generalize the name.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Alex Tabachnik <alext@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/iser: Suppress completions for fast registration work requests
Sagi Grimberg [Thu, 23 Jan 2014 10:31:28 +0000 (12:31 +0200)]
IB/iser: Suppress completions for fast registration work requests

In case iSER uses fast registration method, it should not request for
successful completions on fast registration nor local invalidate
requests.  We color wr_id with ISER_FRWR_LI_WRID in order to correctly
consume error completions.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx4: Fix a sparse endianness warning
Bart Van Assche [Mon, 10 Mar 2014 09:33:05 +0000 (10:33 +0100)]
IB/mlx4: Fix a sparse endianness warning

Fix the following warning for the mlx4 driver:

    $ make M=drivers/infiniband C=2 CF=-D__CHECK_ENDIAN__
    drivers/infiniband/hw/mlx4/qp.c:1885:31: warning: restricted __be16 degrades to integer

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/ocrdma: Fix compiler warning
Prarit Bhargava [Wed, 19 Feb 2014 20:05:16 +0000 (15:05 -0500)]
RDMA/ocrdma: Fix compiler warning

drivers/infiniband/hw/ocrdma/ocrdma_verbs.c: In function ‘_ocrdma_modify_qp’:
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:1299:31: error: ‘old_qps’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  status = ocrdma_mbx_modify_qp(dev, qp, attr, attr_mask, old_qps);

ocrdma_mbx_modify_qp() (and subsequent calls) doesn't appear to use old_qps
so it doesn't need to be passed on.  Removing the variable results in the
warning going away.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Devesh Sharma (Devesh.sharma@emulex.com)
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/nes: Clean up a condition
Dan Carpenter [Thu, 6 Feb 2014 12:48:23 +0000 (15:48 +0300)]
RDMA/nes: Clean up a condition

We don't need to test "ret" twice and also the white space is messed up.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Remove duplicate check in get_a_ctxt()
Dan Carpenter [Thu, 13 Feb 2014 11:09:37 +0000 (14:09 +0300)]
IB/qib: Remove duplicate check in get_a_ctxt()

We already know "pusable" is non-zero, no need to check again.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/usnic: Remove '0x' when using %pa format
Fabio Estevam [Tue, 18 Feb 2014 12:54:27 +0000 (09:54 -0300)]
IB/usnic: Remove '0x' when using %pa format

%pa format already prints in hexadecimal format, so remove the '0x' annotation
to avoid a double '0x0x' pattern.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/nes: Return an error on ib_copy_from_udata() failure instead of NULL
Yann Droneaud [Mon, 10 Mar 2014 22:06:27 +0000 (23:06 +0100)]
IB/nes: Return an error on ib_copy_from_udata() failure instead of NULL

In case of error while accessing to userspace memory, function
nes_create_qp() returns NULL instead of an error code wrapped through
ERR_PTR().  But NULL is not expected by ib_uverbs_create_qp(), as it
check for error with IS_ERR().

As page 0 is likely not mapped, it is going to trigger an Oops when
the kernel will try to dereference NULL pointer to access to struct
ib_qp's fields.

In some rare cases, page 0 could be mapped by userspace, which could
turn this bug to a vulnerability that could be exploited: the function
pointers in struct ib_device will be under userspace total control.

This was caught when using spatch (aka. coccinelle)
to rewrite calls to ib_copy_{from,to}_udata().

Link: https://www.gitorious.org/opteya/ib-hw-nes-create-qp-null
Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci
Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Fix memory leak of recv context when driver fails to initialize.
Dennis Dalessandro [Mon, 17 Mar 2014 18:07:16 +0000 (14:07 -0400)]
IB/qib: Fix memory leak of recv context when driver fails to initialize.

In qib_create_ctxts() we allocate an array to hold recv contexts. Then attempt
to create data for those recv contexts. If that call to qib_create_ctxtdata()
fails then an error is returned but the previously allocated memory is not
freed.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: fixup indentation in qib_ib_rcv()
Yann Droneaud [Mon, 10 Mar 2014 22:06:29 +0000 (23:06 +0100)]
IB/qib: fixup indentation in qib_ib_rcv()

Commit af061a644a0e4d4778 add some code in qib_ib_rcv() which
trigger a warning from coccicheck (coccinelle/spatch):

$ make C=2 CHECK=scripts/coccicheck drivers/infiniband/hw/qib/

  CHECK   drivers/infiniband/hw/qib/qib_verbs.c
drivers/infiniband/hw/qib/qib_verbs.c:679:5-32: code aligned with following code on line 681
  CC [M]  drivers/infiniband/hw/qib/qib_verbs.o

In fact, according to similar code in qib_kreceive(),
qib_ib_rcv() code is correct but improperly indented.

This patch fix indentation for the misaligned portion.

Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: infinipath@intel.com
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: cocci@systeme.lip6.fr
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: add missing braces in do_qib_user_sdma_queue_create()
Yann Droneaud [Mon, 10 Mar 2014 22:06:28 +0000 (23:06 +0100)]
IB/qib: add missing braces in do_qib_user_sdma_queue_create()

Commit c804f07248895ff9c moved qib_assign_ctxt() to
do_qib_user_sdma_queue_create() but dropped the braces
around the statements.

This was spotted by coccicheck (coccinelle/spatch):

$ make C=2 CHECK=scripts/coccicheck drivers/infiniband/hw/qib/

  CHECK   drivers/infiniband/hw/qib/qib_file_ops.c
drivers/infiniband/hw/qib/qib_file_ops.c:1583:2-23: code aligned with following code on line 1587

This patch adds braces back.

Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.com
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: infinipath@intel.com
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: cocci@systeme.lip6.fr
Cc: stable@vger.kernel.org
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Modify software pma counters to use percpu variables
Mike Marciniszyn [Fri, 7 Mar 2014 13:40:55 +0000 (08:40 -0500)]
IB/qib: Modify software pma counters to use percpu variables

The counters, unicast_xmit, unicast_rcv, multicast_xmit, multicast_rcv
are now maintained as percpu variables.

The mad code is modified to add a z_ latch so that the percpu counters
monotonically increase with appropriate adjustments in the reset,
read logic to maintain the z_ latch.

This patch also corrects the fact the unitcast_xmit wasn't handled
at all for UC and RC QPs.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Add percpu counter replacing qib_devdata int_counter
Mike Marciniszyn [Fri, 7 Mar 2014 13:40:49 +0000 (08:40 -0500)]
IB/qib: Add percpu counter replacing qib_devdata int_counter

This patch replaces the dd->int_counter with a percpu counter.

The maintanance of qib_stats.sps_ints and int_counter are
combined into the new counter.

There are two new functions added to read the counter:
- qib_int_counter (for a particular qib_devdata)
- qib_sps_ints (for all HCAs)

A z_int_counter is added to allow the interrupt detection logic
to determine if interrupts have occured since z_int_counter
was "reset".

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Fix debugfs ordering issue with multiple HCAs
Mike Marciniszyn [Fri, 7 Mar 2014 13:32:31 +0000 (08:32 -0500)]
IB/qib: Fix debugfs ordering issue with multiple HCAs

The debugfs init code was incorrectly called before the idr mechanism
is used to get the unit number, so the dd->unit hasn't been
initialized.  This caused the unit relative directory creation to fail
after the first.

This patch moves the init for the debugfs stuff until after all of the
failures and after the unit number has been determined.

A bug in unwind code in qib_alloc_devdata() is also fixed.

Cc: <stable@vger.kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/ipath: Fix potential buffer overrun in sending diag packet routine
Dennis Dalessandro [Thu, 20 Feb 2014 16:02:53 +0000 (11:02 -0500)]
IB/ipath: Fix potential buffer overrun in sending diag packet routine

Guard against a potential buffer overrun.  The size to read from the
user is passed in, and due to the padding that needs to be taken into
account, as well as the place holder for the ICRC it is possible to
overflow the 32bit value which would cause more data to be copied from
user space than is allocated in the buffer.

Cc: <stable@vger.kernel.org>
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/qib: Fix potential buffer overrun in sending diag packet routine
Dennis Dalessandro [Thu, 20 Feb 2014 16:02:48 +0000 (11:02 -0500)]
IB/qib: Fix potential buffer overrun in sending diag packet routine

Guard against a potential buffer overrun.  Right now the qib driver is
protected by the fact that the data structure in question is only 16
bits.  Should that ever change the problem will be exposed. There is a
similar defect in the ipath driver and this brings the two code paths
into sync.

Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/nes: Fix for passing a valid QP pointer to the user space library
Tatyana Nikolova [Tue, 11 Mar 2014 19:20:17 +0000 (14:20 -0500)]
RDMA/nes: Fix for passing a valid QP pointer to the user space library

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoRDMA/nes: Fixes for IRD/ORD negotiation with MPA v2
Tatyana Nikolova [Tue, 11 Mar 2014 19:17:25 +0000 (14:17 -0500)]
RDMA/nes: Fixes for IRD/ORD negotiation with MPA v2

Fixes to enable the negotiation of the supported IRD/ORD sizes with
the peer when exchanging MPA v2 messages in connection establishment.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx5: Expose support for signature MR feature
Sagi Grimberg [Sun, 23 Feb 2014 12:19:13 +0000 (14:19 +0200)]
IB/mlx5: Expose support for signature MR feature

Currently support only T10-DIF types of signature handover operations
(types 1|2|3).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx5: Collect signature error completion
Sagi Grimberg [Sun, 23 Feb 2014 12:19:12 +0000 (14:19 +0200)]
IB/mlx5: Collect signature error completion

This commit takes care of the generated signature error CQE generated
by the HW (if happened).  The underlying mlx5 driver will handle
signature error completions and will mark the relevant memory region
as dirty.

Once the consumer gets the completion for the transaction, it must
check for signature errors on signature memory region using a new
lightweight verb ib_check_mr_status().

In case the user doesn't check for signature error (i.e. doesn't call
ib_check_mr_status() with status check IB_MR_CHECK_SIG_STATUS), the
memory region cannot be used for another signature operation
(REG_SIG_MR work request will fail).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx5: Support IB_WR_REG_SIG_MR
Sagi Grimberg [Sun, 23 Feb 2014 12:19:11 +0000 (14:19 +0200)]
IB/mlx5: Support IB_WR_REG_SIG_MR

This patch implements IB_WR_REG_SIG_MR posted by the user.

Baisically this WR involves 3 WQEs in order to prepare and properly
register the signature layout:

1. post UMR WR to register the sig_mr in one of two possible ways:
    * In case the user registered a single MR for data so the UMR data segment
      consists of:
      - single klm (data MR) passed by the user
      - BSF with signature attributes requested by the user.
    * In case the user registered 2 MRs, one for data and one for protection,
      the UMR consists of:
      - strided block format which includes data and protection MRs and
        their repetitive block format.
      - BSF with signature attributes requested by the user.

2. post SET_PSV in order to set the memory domain initial
   signature parameters passed by the user.
   SET_PSV is not signaled and solicited CQE.

3. post SET_PSV in order to set the wire domain initial
   signature parameters passed by the user.
   SET_PSV is not signaled and solicited CQE.

* After this compound WR we place a small fence for next WR to come.

This patch also introduces some helper functions to set the BSF correctly
and determining the signature format selectors.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx5: Keep mlx5 MRs in a radix tree under device
Sagi Grimberg [Sun, 23 Feb 2014 12:19:10 +0000 (14:19 +0200)]
IB/mlx5: Keep mlx5 MRs in a radix tree under device

This will be useful when processing signature errors on a specific
key.  The mlx5 driver will lookup the matching mlx5 memory region
structure and mark it as dirty (contains signature errors).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx5: Remove MTT access mode from umr flags helper function
Sagi Grimberg [Sun, 23 Feb 2014 12:19:09 +0000 (14:19 +0200)]
IB/mlx5: Remove MTT access mode from umr flags helper function

get_umr_flags helper function might be used for types of access modes
other than ACCESS_MODE_MTT, such as ACCESS_MODE_KLM.  So remove it from
helper, and callers will add their own access mode flag.

This commit does not add/change functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx5: Break up wqe handling into begin & finish routines
Sagi Grimberg [Sun, 23 Feb 2014 12:19:08 +0000 (14:19 +0200)]
IB/mlx5: Break up wqe handling into begin & finish routines

As a preliminary step for signature feature which will require posting
multiple (3) WQEs for a single WR, we break post_send routine WQE
indexing into begin and finish routines.

This patch does not change any functionality.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agoIB/mlx5: Initialize mlx5_ib_qp signature-related members
Sagi Grimberg [Sun, 23 Feb 2014 12:19:07 +0000 (14:19 +0200)]
IB/mlx5: Initialize mlx5_ib_qp signature-related members

If user requested signature enable we initialize relevant mlx5_ib_qp
members.  We mark the qp as sig_enable and we increase the effective
SQ size, but still limit the user max_send_wr to original size
computed.  We also allow the create_qp routine to accept sig_enable
create flag.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
10 years agomlx5: Implement create_mr and destroy_mr
Sagi Grimberg [Sun, 23 Feb 2014 12:19:06 +0000 (14:19 +0200)]
mlx5: Implement create_mr and destroy_mr

Support create_mr and destroy_mr verbs.  Creating ib_mr may be done
for either ib_mr that will register regular page lists like
alloc_fast_reg_mr routine, or indirect ib_mrs that can register other
(pre-registered) ib_mrs in an indirect manner.

In addition user may request signature enable, that will mean that the
created ib_mr may be attached with signature attributes (BSF, PSVs).

Currently we only allow direct/indirect registration modes.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>