habanalabs: bypass reset for continuous h/w error event
author Bharat Jauhari <bjauhari@habana.ai>
Mon, 21 Jun 2021 06:57:19 +0000 (09:57 +0300)
committer Oded Gabbay <ogabbay@kernel.org>
Mon, 18 Oct 2021 09:05:47 +0000 (12:05 +0300)
commit 10cab81d1cf92b1b62234540efba34ccaf7079e8
tree 500c27862cc85ca646e1856e7a71390f8fccb965
parent f05d17b226dbb5e2f21b724918b263cba57f2ad8

The driver may receive continuous fatal H/W error events from the FW
immediately after a reset cycle.
This may be due to a fault in the silicon itself.
In such a case it is better to bypass the reset cycle so we won't be
stuck in an endless loop of resets.

This commit bypasses the reset request if the driver receives two
back-to-back FW fatal errors before the first heartbeat event occurs.
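
The bypass rule described above can be illustrated with a minimal,
standalone sketch. It is not the driver's code: the struct reset_guard
type and the reset_guard_* functions are hypothetical names introduced
only for illustration, and the sketch assumes that a heartbeat event is
what re-arms the reset path.

```c
#include <stdbool.h>

/* Hypothetical per-device state; not the driver's actual structure. */
struct reset_guard {
	/* true once a FW fatal error was seen with no heartbeat since */
	bool fatal_pending;
};

void reset_guard_init(struct reset_guard *g)
{
	g->fatal_pending = false;
}

/* Assumption: a heartbeat shows the device recovered, so re-arm resets. */
void reset_guard_on_heartbeat(struct reset_guard *g)
{
	g->fatal_pending = false;
}

/*
 * Called for each FW fatal error event.
 * Returns true if a reset should be requested, false if it is bypassed.
 */
bool reset_guard_on_fw_fatal(struct reset_guard *g)
{
	if (g->fatal_pending)
		return false;	/* back-to-back fatal error: bypass the reset */

	g->fatal_pending = true;
	return true;		/* first fatal error since last heartbeat */
}
```

In this sketch the first fatal error requests a reset, a second one
arriving before any heartbeat is bypassed, and a heartbeat in between
allows the next fatal error to request a reset again.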

Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
drivers/misc/habanalabs/common/device.c
drivers/misc/habanalabs/common/habanalabs.h
drivers/misc/habanalabs/common/habanalabs_drv.c
drivers/misc/habanalabs/gaudi/gaudi.c
drivers/misc/habanalabs/goya/goya.c