EnhanceIO

Problem with eio_clean_thr process

Open saba69 opened this issue 6 years ago • 4 comments

Hi,

I compiled and installed the master version on kernel 4.14.34. During sequential writes to the disk, the cache device fills up and cannot evict data. eio_clean_thr gets stuck in D state, and eviction (flushing/cleaning) does not resume even after the sequential write IO is stopped. This is the stack trace of the process:

eio_clean_thr D 0 28023 2 0x80000080
Dec 7 12:08:46 srv01 kernel: Call Trace:
Dec 7 12:08:46 srv01 kernel: ? __schedule+0x1ad/0x6a0
Dec 7 12:08:46 srv01 kernel: schedule+0x32/0x80
Dec 7 12:08:46 srv01 kernel: rwsem_down_write_failed+0x1fe/0x380
Dec 7 12:08:46 srv01 kernel: call_rwsem_down_write_failed+0x13/0x20
Dec 7 12:08:46 srv01 kernel: down_write+0x29/0x40
Dec 7 12:08:46 srv01 kernel: eio_clean_set+0x14c/0x9f0 [enhanceio]
Dec 7 12:08:46 srv01 kernel: ? del_timer_sync+0x35/0x40
Dec 7 12:08:46 srv01 kernel: ? call_timer_fn+0x130/0x130
Dec 7 12:08:46 srv01 kernel: eio_clean_thread_proc+0x1bc/0x360 [enhanceio]
Dec 7 12:08:46 srv01 kernel: ? __schedule+0x1b5/0x6a0
Dec 7 12:08:46 srv01 kernel: kthread+0xfc/0x130
Dec 7 12:08:46 srv01 kernel: ? eio_clean_all+0xd0/0xd0 [enhanceio]
Dec 7 12:08:46 srv01 kernel: ? __kthread_parkme+0x70/0x70
Dec 7 12:08:46 srv01 kernel: ret_from_fork+0x35/0x40
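For anyone trying to reproduce this, here is a minimal sketch (not part of EnhanceIO) of one way to spot tasks stuck in D state by reading /proc/<pid>/stat instead of triggering SysRq. The helper names `parse_stat` and `blocked_tasks` are made up for illustration, and the script assumes a Linux /proc filesystem.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = line.index("(")
    rpar = line.rindex(")")
    pid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    state = line[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """List (pid, comm) for tasks in uninterruptible sleep (state D)."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # task exited (or stat was unreadable) during the scan
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

A task can pass through D state briefly during normal IO, so a single hit proves little; the same PID showing up across repeated runs after IO has stopped would match the behaviour described here. Kernel thread names are truncated in /proc, so the cleaner thread may appear under a shortened name.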

Thanks, Saba

saba69, Dec 07 '19 09:12

Please note that while eio is locked up, IO can still be issued directly to both the source and cache devices.

saba69, Dec 07 '19 10:12

I just compiled and installed the master version on kernel 4.14.158 and still hit the same bug. Here is the stack trace:

enhanceio_lru: Initialized 32639 sets in LRU
Dec 7 14:48:09 srv01 kernel: sysrq: SysRq : Show Blocked State
Dec 7 14:48:09 srv01 kernel: task PC stack pid father
Dec 7 14:48:09 srv01 kernel: eio_clean_threa D 0 3640 2 0x80000080
Dec 7 14:48:09 srv01 kernel: Call Trace:
Dec 7 14:48:09 srv01 kernel: ? schedule+0x1b0/0x6b0
Dec 7 14:48:09 srv01 kernel: ? switch_to_asm+0x41/0x70
Dec 7 14:48:09 srv01 kernel: ? switch_to_asm+0x35/0x70
Dec 7 14:48:09 srv01 kernel: schedule+0x32/0x80
Dec 7 14:48:09 srv01 kernel: rwsem_down_write_failed+0x206/0x380
Dec 7 14:48:09 srv01 kernel: ? switch_to_asm+0x41/0x70
Dec 7 14:48:09 srv01 kernel: ? switch_to_asm+0x35/0x70
Dec 7 14:48:09 srv01 kernel: call_rwsem_down_write_failed+0x13/0x20
Dec 7 14:48:09 srv01 kernel: down_write+0x29/0x40
Dec 7 14:48:09 srv01 kernel: eio_clean_set+0x14c/0x980 [enhanceio]
Dec 7 14:48:09 srv01 kernel: ? del_timer_sync+0x35/0x40
Dec 7 14:48:09 srv01 kernel: ? call_timer_fn+0x140/0x140
Dec 7 14:48:09 srv01 kernel: eio_clean_thread_proc+0x1bc/0x360 [enhanceio]
Dec 7 14:48:09 srv01 kernel: ? schedule+0x1b8/0x6b0
Dec 7 14:48:09 srv01 kernel: kthread+0xff/0x140
Dec 7 14:48:09 srv01 kernel: ? eio_clean_all+0xd0/0xd0 [enhanceio]
Dec 7 14:48:09 srv01 kernel: ? __kthread_parkme+0x90/0x90
Dec 7 14:48:09 srv01 kernel: ret_from_fork+0x35/0x40

saba69, Dec 07 '19 11:12

The master version flushes the cache without any problem on kernel 3.10.0. However, with the same sequential write workload, it takes much longer to fill the cache than on kernel 4.14.

saba69, Dec 08 '19 05:12

I tested the master version on kernel 5.4.2 and the flush process works correctly.

saba69, Dec 08 '19 13:12