nvme: fix use after free when disconnecting a reconnecting ctrl
author: Ruozhu Li <liruozhu@huawei.com>
Thu, 4 Nov 2021 07:13:32 +0000 (15:13 +0800)
committer: Christoph Hellwig <hch@lst.de>
Tue, 7 Dec 2021 17:21:16 +0000 (18:21 +0100)
commit: 8b77fa6fdce0fc7147bab91b1011048758290ca4
tree: 3fdbaf4e2f876d1a0856e016231844a8e025f1c7
parent: c7c15ae3dc50c0ab46c5cbbf8d2f3d3307e51f37

A crash happens when trying to disconnect a reconnecting ctrl:

 1) The network was cut off just after the connection was established,
    and the scan work hung waiting for some I/Os to complete.  Those
    I/Os were retried because we return BLK_STS_RESOURCE to the block
    layer while reconnecting.
 2) After a while, I tried to disconnect this connection.  This
    procedure also hung because it tried to obtain ctrl->scan_lock.
    Note that by this point the controller state had already been
    switched to NVME_CTRL_DELETING.
 3) In nvme_check_ready(), we always return true when ctrl->state is
    NVME_CTRL_DELETING, so those retrying I/Os were issued to the bottom
    device, which had already been freed.

To fix this, when ctrl->state is NVME_CTRL_DELETING, issue the command
to the bottom device only when the queue state is live.  Otherwise,
return a host path error to the block layer.
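
The decision described above can be modeled in a few lines of userspace
C.  This is only an illustrative sketch, not the kernel patch itself:
the enum names, check_ready() and dispatch() are hypothetical stand-ins,
and the handling of the non-deleting states is deliberately simplified.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the controller states. */
enum ctrl_state { CTRL_LIVE, CTRL_CONNECTING, CTRL_DELETING };

/* What the driver does with a request (illustrative names). */
enum verdict {
    ISSUE,          /* send the command to the bottom device */
    RETRY_LATER,    /* models returning BLK_STS_RESOURCE */
    FAIL_HOST_PATH  /* models returning a host path error */
};

/* Models the readiness check after the fix. */
static bool check_ready(enum ctrl_state state, bool queue_live)
{
    if (state == CTRL_LIVE)
        return true;
    if (state == CTRL_DELETING)
        return queue_live;  /* before the fix: unconditionally true */
    return false;           /* simplification: other states never ready */
}

/* Models the dispatch decision for a single request. */
static enum verdict dispatch(enum ctrl_state state, bool queue_live)
{
    if (check_ready(state, queue_live))
        return ISSUE;
    if (state == CTRL_DELETING)
        return FAIL_HOST_PATH;  /* do not requeue while deleting */
    return RETRY_LATER;         /* e.g. while reconnecting */
}
```

In this model, dispatch(CTRL_DELETING, false) fails the request instead
of issuing it, which is the case that previously reached the freed
device.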

Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
drivers/nvme/host/core.c
drivers/nvme/host/nvme.h