zfs icon indicating copy to clipboard operation
zfs copied to clipboard

deadlock between spa_errlog_lock and dp_config_rwlock

Open ahrens opened this issue 3 years ago • 4 comments

There is a lock order inversion deadlock between spa_errlog_lock and dp_config_rwlock:

A thread in spa_delete_dataset_errlog() is running from a sync task. It is holding the dp_config_rwlock for writer (see dsl_sync_task_sync()), and waiting for the spa_errlog_lock.

A thread in dsl_pool_config_enter() is holding the spa_errlog_lock (see spa_get_errlog_size()) and waiting for the dp_config_rwlock (as reader).

Note that this was introduced by 0409d3327371cef8a8c5886cb7530ded6f5f1091

It may be possible to address this by defining the lock ordering to be dp_config_rwlock first, then spa_errlog_lock / spa_errlist_lock. spa_get_errlog() and spa_get_errlog_size() can acquire the locks in this order, and then process_error_block() and get_head_and_birth_txg() can verify that the dp_config_rwlock is already held.

@gamanakis any thoughts on this?

ahrens avatar Nov 29 '22 23:11 ahrens

Thank you for catching this. Your suggestion makes sense, I will take a look.

gamanakis avatar Nov 30 '22 18:11 gamanakis

There is an additional inversion here: spa_errlog_sync() running from spa_sync_iterate_to_convergence() acquires the spa_errlog_lock and then waits for the dp_config_rwlock in get_head_and_birth_txg().

I think it's reasonable to have the same lock order in all cases.

gamanakis avatar Dec 01 '22 22:12 gamanakis

Nice catch. @gamanakis are you planning to implement the fix for this, and if so do have a rough timeline? If not I will take it.

ahrens avatar Dec 02 '22 17:12 ahrens

@ahrens please feel free to go ahead.

gamanakis avatar Dec 02 '22 19:12 gamanakis