OSDN Git Service

drm/amdgpu: add debugfs node to toggle ras error cnt harvest
author Guchun Chen <guchun.chen@amd.com>
Tue, 4 Aug 2020 07:05:01 +0000 (15:05 +0800)
committer Alex Deucher <alexander.deucher@amd.com>
Fri, 14 Aug 2020 20:12:47 +0000 (16:12 -0400)
Before RAS recovery is issued, the user can write to this debugfs
node to enable or disable harvesting of the RAS error count
registers of all RAS IPs. Disabling the harvest preserves the
hardware registers' state instead of clearing it.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
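
As a rough illustration of how a user might operate the node described in the commit message, here is a minimal userspace sketch in C. The helper names `set_bool_node` and `get_bool_node` are hypothetical and not part of the patch. It relies on the usual behaviour of files created with `debugfs_create_bool`, which accept "1" or "0" on write and report "Y" or "N" on read.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write "1" or "0" to a debugfs bool node.
 * Returns 0 on success, -1 on failure.
 */
int set_bool_node(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(value ? "1\n" : "0\n", f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: read a debugfs bool node back.
 * Returns 1 for "Y"/"1", 0 for "N"/"0", -1 on error or unexpected content.
 */
int get_bool_node(const char *path)
{
    char buf[8] = {0};
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    if (buf[0] == 'Y' || buf[0] == '1')
        return 1;
    if (buf[0] == 'N' || buf[0] == '0')
        return 0;
    return -1;
}
```

On a real system the path would likely look like `/sys/kernel/debug/dri/0/ras/disable_ras_err_cnt_harvest`. That path is an assumption: it presumes debugfs is mounted at `/sys/kernel/debug`, that the GPU is DRM minor 0, and that `con->dir` is the driver's `ras` directory. Writing the node normally requires root.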
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c

index fbe464c..2d1fad1 100644
@@ -1215,6 +1215,13 @@ static void amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *adev)
         */
        debugfs_create_bool("auto_reboot", S_IWUGO | S_IRUGO, con->dir,
                                &con->reboot);
+
+       /*
+        * User could set this not to clean up hardware's error count register
+        * of RAS IPs during ras recovery.
+        */
+       debugfs_create_bool("disable_ras_err_cnt_harvest", 0644,
+                       con->dir, &con->disable_ras_err_cnt_harvest);
 }
 
 void amdgpu_ras_debugfs_create(struct amdgpu_device *adev,
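
The hunk above only creates the debugfs node; it does not show how the recovery path consults the flag. The following standalone C model is a sketch of the intended effect under stated assumptions. The struct `ras_ctx_sketch` and the function `harvest_on_recovery` are hypothetical, only the field name `disable_ras_err_cnt_harvest` comes from the patch, and "harvest" is modelled as read-then-clear.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the driver's RAS context.
 * Only the flag name is taken from the patch.
 */
struct ras_ctx_sketch {
    bool disable_ras_err_cnt_harvest; /* toggled through the debugfs node */
    unsigned int hw_err_count;        /* models a hardware error count register */
};

/*
 * Model of the harvest step during recovery: read the counter and clear it,
 * unless the user disabled harvesting. Returns the count that was harvested,
 * or 0 if the register was left untouched.
 */
static unsigned int harvest_on_recovery(struct ras_ctx_sketch *ctx)
{
    unsigned int seen;

    if (ctx->disable_ras_err_cnt_harvest)
        return 0; /* keep the register contents for later inspection */

    seen = ctx->hw_err_count;
    ctx->hw_err_count = 0; /* harvesting clears the register in this model */
    return seen;
}
```

With the flag set, the modelled register keeps its value across recovery, which is the behaviour the commit message describes as keeping the hardware registers' status.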