[DocDB] Tablet Bootstrap Hangs Due to Truncate

Open yusong-yan opened this issue 1 year ago • 1 comments

Jira Link: DB-12175

Description

We observed tablet bootstrap getting stuck while applying a truncate operation. Stack trace of the stuck thread:

    @     0x7f926b3be3b7  __pthread_cond_timedwait
    @     0x562ec520cd81  std::__1::condition_variable::wait_until<>()
    @     0x562ec53dc557  std::__1::this_thread::sleep_until<>()
    @     0x562ec687791a  yb::RWOperationCounter::DisableAndWaitForOps()
    @     0x562ec6878eef  yb::ScopedRWOperationPause::ScopedRWOperationPause()
    @     0x562ec6200579  yb::tablet::Tablet::PauseReadWriteOperations()
    @     0x562ec61ff98b  yb::tablet::Tablet::StartShutdownRocksDBs()
    @     0x562ec6226adc  yb::tablet::Tablet::Truncate()
    @     0x562ec624c9f7  yb::tablet::TabletBootstrap::PlayAnyRequest()
    @     0x562ec624a93d  yb::tablet::TabletBootstrap::ApplyCommittedPendingReplicates()
    @     0x562ec62448e5  yb::tablet::TabletBootstrap::PlaySegments()
    @     0x562ec6238087  yb::tablet::TabletBootstrap::Bootstrap()
    @     0x562ec62500ac  yb::tablet::BootstrapTablet()
    @     0x562ec64fa5b6  yb::tserver::TSTabletManager::OpenTablet()
    @     0x562ec68b3598  yb::ThreadPool::DispatchThread()
    @     0x562ec68af753  yb::thread::SuperviseThread()

Most likely, it is waiting for TransactionLoader::Executor to release its ScopedRWOperation, which only happens after bootstrap completes. Because the truncate runs as part of bootstrap, neither side can make progress.

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • [X] I confirm this issue does not contain any sensitive information.

yusong-yan, Jul 18 '24 15:07

Here is where the executor is destroyed.

void LoadFinished(Status load_status) EXCLUDES(status_resolvers_mutex_) override {
 ...
 start_latch_.Wait();

start_latch_.Wait() returns only after tablet bootstrap finishes.
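
The suspected cycle can be reproduced in isolation. The following is a minimal sketch, not YugabyteDB code: OpCounter is a hypothetical stand-in for yb::RWOperationCounter, the std::promise stands in for start_latch_, and the timeout exists only so the example terminates (the trace above suggests the real wait keeps polling).

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for yb::RWOperationCounter: tracks in-flight operations.
class OpCounter {
 public:
  void Acquire() { ops_.fetch_add(1); }
  void Release() {
    int prev = ops_.fetch_sub(1);
    assert(prev > 0);  // a release must match an earlier acquire
    (void)prev;
  }
  // Polls until no operations are pending or the timeout expires.
  // Returns true if the counter drained, false on timeout.
  bool WaitForOps(std::chrono::milliseconds timeout) {
    auto deadline = std::chrono::steady_clock::now() + timeout;
    while (ops_.load() > 0) {
      if (std::chrono::steady_clock::now() >= deadline) return false;
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return true;
  }

 private:
  std::atomic<int> ops_{0};
};

// Simulates a truncate applied during bootstrap while a loader thread holds
// an operation and waits for bootstrap to finish.
// Returns true if the truncate managed to pause operations.
bool SimulateBootstrapWithTruncate(std::chrono::milliseconds timeout) {
  OpCounter counter;
  std::promise<void> start_latch;  // stands in for start_latch_
  std::shared_future<void> started = start_latch.get_future().share();
  std::promise<void> loader_ready;

  // Plays the role of the loader's executor: holds an operation until
  // bootstrap is reported as finished.
  std::thread loader([&] {
    counter.Acquire();
    loader_ready.set_value();
    started.wait();  // returns only once "bootstrap" completes
    counter.Release();
  });

  loader_ready.get_future().wait();
  // "Truncate" during bootstrap: needs the counter to drain first.
  bool paused = counter.WaitForOps(timeout);
  // Bootstrap finishes only after the truncate step is over.
  start_latch.set_value();
  loader.join();
  return paused;
}
```

In this sketch SimulateBootstrapWithTruncate returns false for any finite timeout, because the loader releases its operation only after the latch is set, and the latch is set only after the wait gives up. Without a timeout, the wait would never return, which matches the hang in the stack trace.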

yusong-yan, Jul 18 '24 15:07