
Repeated ASSERTs cause kernel stack traces in dc_link.c because of write_i2c_default_retimer_setting

Open drwetter opened this issue 5 years ago • 0 comments

Hi,

I am running a vendor kernel (>5.1.5) and have been seeing frequent kernel stack traces for as long as I have had this machine. They are caused by the AMD GPU driver. A 1:1 comparison with the code here shows me no relevant change that could fix this.

prompt> dmesg | grep WARNING | tail -20                                                                                                                      
[353280.920265] WARNING: CPU: 5 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1843 write_i2c_default_retimer_setting+0x1b9/0x320 [amdgpu]
[353280.921511] WARNING: CPU: 5 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1855 write_i2c_default_retimer_setting+0x208/0x320 [amdgpu]
[353283.660051] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1795 write_i2c_default_retimer_setting+0x64/0x320 [amdgpu]
[353283.660945] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1807 write_i2c_default_retimer_setting+0xb6/0x320 [amdgpu]
[353283.661612] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1819 write_i2c_default_retimer_setting+0x110/0x320 [amdgpu]
[353283.662277] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1831 write_i2c_default_retimer_setting+0x160/0x320 [amdgpu]
[353283.662944] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1843 write_i2c_default_retimer_setting+0x1b9/0x320 [amdgpu]
[353283.663623] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1855 write_i2c_default_retimer_setting+0x208/0x320 [amdgpu]
[362502.626738] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1795 write_i2c_default_retimer_setting+0x64/0x320 [amdgpu]
[362502.627621] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1807 write_i2c_default_retimer_setting+0xb6/0x320 [amdgpu]
[362502.628505] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1819 write_i2c_default_retimer_setting+0x110/0x320 [amdgpu]
[362502.629308] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1831 write_i2c_default_retimer_setting+0x160/0x320 [amdgpu]
[362502.630076] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1843 write_i2c_default_retimer_setting+0x1b9/0x320 [amdgpu]
[362502.630941] WARNING: CPU: 4 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1855 write_i2c_default_retimer_setting+0x208/0x320 [amdgpu]
[362505.385466] WARNING: CPU: 3 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1795 write_i2c_default_retimer_setting+0x64/0x320 [amdgpu]
[362505.386142] WARNING: CPU: 3 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1807 write_i2c_default_retimer_setting+0xb6/0x320 [amdgpu]
[362505.386861] WARNING: CPU: 3 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1819 write_i2c_default_retimer_setting+0x110/0x320 [amdgpu]
[362505.387521] WARNING: CPU: 3 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1831 write_i2c_default_retimer_setting+0x160/0x320 [amdgpu]
[362505.388163] WARNING: CPU: 3 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1843 write_i2c_default_retimer_setting+0x1b9/0x320 [amdgpu]
[362505.388809] WARNING: CPU: 3 PID: 2739 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:1855 write_i2c_default_retimer_setting+0x208/0x320 [amdgpu]
prompt> 

It looks to me like these come from the several ASSERT statements in write_i2c_default_retimer_setting() in https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/master/drivers/gpu/drm/amd/display/dc/core/dc_link.c, each of which has this form:

if (!i2c_success)
	/* Write failure */
	ASSERT(i2c_success);
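
To illustrate what I think I am seeing, here is a minimal user-space sketch in C. It assumes that ASSERT only logs a warning and lets the function carry on, which is my reading of the dmesg output and not something I have verified in the driver. The names fake_i2c_write() and write_retimer_defaults_sketch(), as well as the register/value pairs, are made up for the example. In the sketch all six checks share one source line because of the loop, whereas the log above shows six distinct line numbers (1795 to 1855).

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how many times the stand-in ASSERT fired. */
static int warnings_emitted;

/*
 * Stand-in for the driver's ASSERT. Assumption: a failed check is only
 * reported as a warning and execution continues, as the WARNING lines in
 * dmesg suggest. This is NOT the driver's actual macro.
 */
#define ASSERT(expr)                                                \
	do {                                                        \
		if (!(expr)) {                                      \
			warnings_emitted++;                         \
			fprintf(stderr, "WARNING: at %s:%d %s()\n", \
				__FILE__, __LINE__, __func__);      \
		}                                                   \
	} while (0)

/* Hypothetical I2C write that always fails, e.g. no device answers. */
static bool fake_i2c_write(unsigned char reg, unsigned char value)
{
	(void)reg;
	(void)value;
	return false;
}

/*
 * Mimics the shape suggested by the log: six writes, each checked on its
 * own. Returns the number of warnings produced by this single call.
 */
static int write_retimer_defaults_sketch(void)
{
	/* Placeholder register/value pairs, not the real retimer settings. */
	static const unsigned char regs[6][2] = {
		{0x01, 0x00}, {0x02, 0x00}, {0x03, 0x00},
		{0x04, 0x00}, {0x05, 0x00}, {0x06, 0x00},
	};

	warnings_emitted = 0;
	for (int i = 0; i < 6; i++) {
		bool i2c_success = fake_i2c_write(regs[i][0], regs[i][1]);

		if (!i2c_success)
			/* Write failure */
			ASSERT(i2c_success);
	}
	return warnings_emitted;
}
```

Calling write_retimer_defaults_sketch() once returns 6 and prints six warning lines to stderr. If the real function behaves the same way, a single call with a failing I2C transfer would explain each burst of six WARNING entries.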

Two questions:

  • Why does every call cause a stack trace? I am not a kernel hacker, but I am not a noob either, and this still looks confusing to me.
  • What exactly fails here when accessing the I2C bus? Is there a module that is not loaded?
prompt# lsmod | grep i2c
i2c_algo_bit           16384  2 igb,amdgpu
i2c_piix4              28672  0
prompt# 

Cheers, Dirk

drwetter, Jun 20 '19 07:06