drm/amdgpu: Fix RAS page retirement with mode2 reset on Aldebaran
author Mukul Joshi <mukul.joshi@amd.com>
Tue, 21 Sep 2021 00:48:23 +0000 (20:48 -0400)
committer Alex Deucher <alexander.deucher@amd.com>
Wed, 13 Oct 2021 18:14:48 +0000 (14:14 -0400)
commit 91a1a52d03aa0f1f2b51c7df8a7bf437e906e29f
tree 76cd74fbfdf110ea66892d85b71feb2e8c767d21
parent a4967a1ebf1b9e68cc99ab666ece65733fffcac6

During mode2 reset, the GPU is temporarily removed from the
mgpu_info list. As a result, page retirement fails because it
cannot find the GPU in the GPU list.
To fix this, create our own list of GPUs that support MCE notifier
based page retirement and use that list to check if the UMC error
occurred on a GPU that supports MCE notifier based page retirement.
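
The approach described above can be illustrated with a minimal user-space sketch. It is not the code from this commit: the names `gpu_device`, `mce_gpu_list`, `mce_list_register`, `mce_list_find`, the `node_id` field, and the capacity of 16 are all hypothetical and chosen only for illustration. The point it shows is that a GPU is recorded once in a private list, and the error handler searches that list rather than a global list from which the GPU may be temporarily removed during reset.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GPU_INSTANCE 16  /* assumed capacity for this sketch */

/* Hypothetical stand-in for the driver's per-GPU device structure. */
struct gpu_device {
    int node_id;  /* identifier used to match an error report to a GPU */
};

/*
 * Private list of GPUs that support MCE-notifier-based page retirement.
 * It is independent of any global GPU list, so a reset that temporarily
 * removes a GPU from that other list does not affect lookups here.
 */
struct mce_gpu_list {
    struct gpu_device *devs[MAX_GPU_INSTANCE];
    int num_gpu;
};

static struct mce_gpu_list mce_list;

/* Record a GPU once, at setup time. Returns 0 on success, -1 if full. */
static int mce_list_register(struct gpu_device *dev)
{
    assert(dev != NULL);
    if (mce_list.num_gpu >= MAX_GPU_INSTANCE)
        return -1;
    mce_list.devs[mce_list.num_gpu++] = dev;
    return 0;
}

/* Find the GPU an error report refers to, or NULL if none is registered. */
static struct gpu_device *mce_list_find(int node_id)
{
    for (int i = 0; i < mce_list.num_gpu; i++) {
        if (mce_list.devs[i]->node_id == node_id)
            return mce_list.devs[i];
    }
    return NULL;
}
```

In this sketch a handler would call `mce_list_find()` with the identifier taken from the error report and skip page retirement when the result is `NULL`. The sketch omits locking and removal of entries, which a real driver would need to consider.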

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c