rocksdb icon indicating copy to clipboard operation
rocksdb copied to clipboard

remove mutex for write

Open Little-Wallace opened this issue 5 years ago • 10 comments

Signed-off-by: Little-Wallace [email protected]

Summary

Cherry-pick https://github.com/facebook/rocksdb/pull/7516 This PR is to avoid write operation blocking on mutex when compaction or other method hold the mutex too long.

Little-Wallace avatar Oct 19 '20 05:10 Little-Wallace

Could you pass facebook's CI, or at least figure out why they failed? I see you have several builds timed out. That usually means certain cases are hanging (they don't output PASSED log nor error log).

tabokie avatar Jan 27 '22 08:01 tabokie

/test

Little-Wallace avatar Feb 01 '22 05:02 Little-Wallace

Could you pass facebook's CI, or at least figure out why they failed? I see you have several builds timed out. That usually means certain cases are hanging (they don't output PASSED log nor error log).

It seems to be a unstable test. These tests run a long time and may fail because of environment.

Little-Wallace avatar Feb 07 '22 03:02 Little-Wallace

@Little-Wallace I don't think so. There are some unstable ones, but they won't cause nearly half of the tests to fail (usually at most 2).

tabokie avatar Feb 07 '22 03:02 tabokie

@Little-Wallace I don't think so. There are some unstable ones, but they won't cause nearly half of the tests to fail (usually at most 2).

The tests did not fail. But the process was killed because out of time

Little-Wallace avatar Feb 11 '22 07:02 Little-Wallace

@Little-Wallace Some tests are hanging, mostly likely on some sync point conditions. It's better to figure out exactly what are those conditions, even though we might not need to fix them.

tabokie avatar Feb 11 '22 07:02 tabokie

Could you explain the bug you just fixed in upstream? So I know where the review process is flawed.

The function SyncClosedLogs must not hold mutex_ because this function may wait for other thread until they finish wal fsync. So we must unlock mutex_ before every where calling SyncClosedLogs

Little-Wallace avatar Feb 15 '22 12:02 Little-Wallace

I fix the problem of https://github.com/facebook/rocksdb/pull/7516 in this commit: https://github.com/facebook/rocksdb/pull/7516/commits/5fe832a2f1874ef5e9a8e2fc73fbc243642a1b22

There are some other bugs which are bout code only in main branch of facebook/rocksdb, so I did not port them here.

Little-Wallace avatar Feb 15 '22 12:02 Little-Wallace

I have finished the benchmark test. See details in https://docs.google.com/document/d/1f72BGmN1kREorkaCzhbsB0oocTP13IKJw4jy5mfm-TE/edit#heading=h.mtjjjfyw1ei I will run a long time stable test to make sure this PR would not cause other bug or error.

Little-Wallace avatar Feb 21 '22 09:02 Little-Wallace

rest LGTM. Can we add a test to cover the case of the mutex?

I thought there are enough tests in RocksDB to cover this condition? I have passed the CI of facebook to make sure that it is no problem

Little-Wallace avatar Mar 01 '22 06:03 Little-Wallace