pika startup time is too long
Is this a regression?
Yes
Description
Starting up with 215 GB of data takes about 15 minutes. With the same configuration, an older version does not have this problem.
The specific cause has not been pinpointed yet.
Please provide a link to a minimal reproduction of the bug
No response
Screenshots or videos
No response
Please provide the version you discovered this bug in (check about page for version information)
Version: versions after 4.0.1-alpha
Anything else?
No response
There is a max-total-wal-size parameter that sets the maximum total WAL size for a single rocksdb instance. With complex data types, meta information and data are stored in different column families; the meta column family holds little data and may never get flushed, so the WALs keep accumulating. Going forward, pika needs to support opening the rocksdb instances concurrently; right now the sequential open also takes a long time.
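As a rough illustration of the concurrent-open idea, here is a minimal sketch with placeholder paths, options, and error handling (not pika's actual startup code): each rocksdb instance is opened on its own thread, so total startup time is bounded by the slowest instance rather than the sum over all instances.

```cpp
#include <future>
#include <string>
#include <vector>

#include "rocksdb/db.h"
#include "rocksdb/options.h"

// Open each rocksdb instance on its own thread instead of sequentially.
std::vector<rocksdb::DB*> OpenAllConcurrently(const std::vector<std::string>& paths) {
  std::vector<std::future<rocksdb::DB*>> futures;
  futures.reserve(paths.size());
  for (const auto& path : paths) {
    futures.push_back(std::async(std::launch::async, [path]() -> rocksdb::DB* {
      rocksdb::Options options;                 // placeholder options
      options.create_if_missing = true;
      options.max_total_wal_size = 1ULL << 30;  // 1 GB WAL cap, per the discussion
      rocksdb::DB* db = nullptr;
      rocksdb::Status s = rocksdb::DB::Open(options, path, &db);
      return s.ok() ? db : nullptr;             // real code must surface the error
    }));
  }
  std::vector<rocksdb::DB*> dbs;
  for (auto& f : futures) {
    dbs.push_back(f.get());  // join all opens
  }
  return dbs;
}
```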
The configuration is just the defaults: 1 GB max-total-wal-size and 6 DBs. And with the same configuration, switching to an older-version package starts up quite quickly.
Startup log. It looks like in some cases startup gets stuck before TimerTaskThread:
```
I20250325 16:22:00.708266 17151 pika_db.cc:50] db0 DB Success
I20250325 16:22:00.708424 17678 pika_cache_load_thread.cc:181] PikaCacheLoadThread::ThreadMain Start
I20250325 17:37:29.257035 17151 net_util.cc:121] TimerTaskThread Starting...
I20250325 17:37:29.258409 17151 pika_server.cc:214] Pika Server going to start
I20250325 17:37:29.258432 17151 rsync_server.cc:48] start RsyncServer ...
I20250325 17:37:29.258682 17151 rsync_server.cc:60] RsyncServer started ...
```
In the 16:22 ~ 17:37 interval pika produced no log output; did each rocksdb instance also produce no log output during this period? My feeling is that this is still at the rocksdb level.
rocksdb has plenty of logs, but from what I see they are mostly about reads and compaction, so the two are probably unrelated. The background process keeps running, yet the service port never actually comes up. I just checked in a debugger: it blocks here: pika_cache_load_thread.cc:189 loadkeys_cond_.wait(lq)
```
(gdb) n
I20250418 16:25:26.120790 26679 pika_cache_load_thread.cc:181] PikaCacheLoadThread::ThreadMain Start
183 while (!should_exit_) {
(gdb) n
184 std::deque<std::tuple<const char, std::string, const std::shared_ptr<DB>>> load_keys;
(gdb) n
186 std::unique_lock lq(loadkeys_mutex_);
(gdb) n
187 waitting_load_keys_num_ = loadkeys_queue_.size();
(gdb) p lq
$1 = {_M_device = 0x7ffff6f97618, _M_owns = false}
(gdb) p loadkeys_queue_.size()
Cannot evaluate function -- may be inlined
(gdb) p lq.size()
Couldn't find method std::unique_lock<std::mutex>::size
(gdb) n
188 while (!should_exit_ && loadkeys_queue_.size() <= 0) {
(gdb) call loadkeys_cond_.notify_all()
(gdb) n
PikaCacheLoadThread::ThreadMain() () at /pika/src/pika_cache_load_thread.cc:188
188 while (!should_exit_ && loadkeys_queue_.size() <= 0) {
(gdb) n
Breakpoint 2, PikaCacheLoadThread::ThreadMain() () at /pika/src/pika_cache_load_thread.cc:189
189 loadkeys_cond_.wait(lq);
(gdb) p lq
$2 = {_M_device = 0x7ffff6f97618, _M_owns = true}
(gdb) n
```
That thread asynchronously loads data from rocksdb into rediscache; when there is no data to load, it is expected to wait on the condition variable. If your environment is still up, check where the startup thread is blocked. Also, in the rocksdb logs, please grep for the timestamps of the "Recovering from manifest" and "Recovered from manifest" log lines.
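For context, here is a minimal self-contained sketch of the consumer pattern visible in the gdb session above (simplified types and names; not pika's actual PikaCacheLoadThread code). An empty queue parks the thread on the condition variable, which is the normal idle state rather than a hang:

```cpp
#include <atomic>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Simplified stand-in for the loop in pika_cache_load_thread.cc: a consumer
// that sleeps on a condition variable while its work queue is empty.
class CacheLoader {
 public:
  void ThreadMain() {
    while (!should_exit_) {
      std::deque<std::string> load_keys;
      {
        std::unique_lock<std::mutex> lq(loadkeys_mutex_);
        while (!should_exit_ && loadkeys_queue_.empty()) {
          loadkeys_cond_.wait(lq);  // parks here when there is nothing to load
        }
        std::swap(load_keys, loadkeys_queue_);  // drain the queue under the lock
      }
      // ... load each drained key from rocksdb into the redis cache ...
    }
  }

  void Push(std::string key) {
    {
      std::lock_guard<std::mutex> lq(loadkeys_mutex_);
      loadkeys_queue_.push_back(std::move(key));
    }
    loadkeys_cond_.notify_one();  // wake the loader thread
  }

 private:
  std::atomic<bool> should_exit_{false};
  std::mutex loadkeys_mutex_;
  std::condition_variable loadkeys_cond_;
  std::deque<std::string> loadkeys_queue_;
};
```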
少一: https://github.com/OpenAtomFoundation/pikiwidb/pull/2888 causes the blocking in DB::WashData() () at /pika/src/pika_db.cc:66
```
Thread 1 (Thread 0x7ffff7fe9d80 (LWP 20433)):
#0 0x00007ffff76c7fc3 in pread64 () from /usr/lib64/libpthread.so.0
#1 0x0000000000b84675 in rocksdb::PosixRandomAccessFile::Read(unsigned long, unsigned long, rocksdb::IOOptions const&, rocksdb::Slice*, char*, rocksdb::IODebugContext*) const ()
#2 0x0000000000a0c209 in rocksdb::RandomAccessFileReader::Read(rocksdb::IOOptions const&, unsigned long, unsigned long, rocksdb::Slice*, char*, std::unique_ptr<char [], std::default_delete<char []> >*) const ()
#3 0x0000000000ad2289 in rocksdb::BlockFetcher::ReadBlockContents() ()
#4 0x0000000000aa802a in std::enable_if<(rocksdb::Block_kData::kCacheEntryRole==((rocksdb::CacheEntryRole)13))||(true), rocksdb::Status>::type rocksdb::BlockBasedTable::MaybeReadBlockAndLoadToCache<rocksdb::Block_kData>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, bool, rocksdb::CachableEntry<rocksdb::Block_kData>*, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::BlockContents*, bool, bool) const ()
#5 0x0000000000aa8507 in std::enable_if<(rocksdb::Block_kData::kCacheEntryRole==((rocksdb::CacheEntryRole)13))||(true), rocksdb::Status>::type rocksdb::BlockBasedTable::RetrieveBlock<rocksdb::Block_kData>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, rocksdb::CachableEntry<rocksdb::Block_kData>*, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, bool, bool, bool, bool) const ()
#6 0x0000000000c6a25a in rocksdb::DataBlockIter* rocksdb::BlockBasedTable::NewDataBlockIterator<rocksdb::DataBlockIter>(rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::DataBlockIter*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::FilePrefetchBuffer*, bool, bool, rocksdb::Status&, bool) const ()
#7 0x0000000000c610fa in rocksdb::BlockBasedTableIterator::InitDataBlock() ()
#8 0x0000000000c63421 in rocksdb::BlockBasedTableIterator::FindBlockForward() ()
#9 0x0000000000c64168 in rocksdb::BlockBasedTableIterator::NextAndGetResult(rocksdb::IterateResult*) ()
#10 0x00000000009898f4 in rocksdb::(anonymous namespace)::LevelIterator::NextAndGetResult(rocksdb::IterateResult*) ()
#11 0x0000000000ae7895 in rocksdb::MergingIterator::NextAndGetResult(rocksdb::IterateResult*) ()
#12 0x00000000008d371a in rocksdb::DBIter::Next() ()
#13 0x0000000000603491 in DB::WashData() () at /pika/src/pika_db.cc:66
#14 0x00000000004bf386 in main () at /opt/rh/devtoolset-9/root/usr/include/c++/9/bits/shared_ptr_base.h:1020
#15 0x00007ffff730d555 in __libc_start_main () from /usr/lib64/libc.so.6
#16 0x0000000000550f4e in _start () at /opt/rh/devtoolset-9/root/usr/include/c++/9/ext/new_allocator.h:89
```
This is caused by the overly long execution time of the data-washing pass that was added to clean up dirty data written by a bug in version 4.0.0; the wash time scales with the amount of string data. Related PR: https://github.com/OpenAtomFoundation/pikiwidb/pull/2888 A temporary workaround is to set wash_data to false in the configuration file, but first confirm whether incr-related commands were called before.
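To make the cost concrete, here is a rough, illustrative sketch (names and structure are assumptions, not pika's actual DB::WashData() in pika_db.cc) of a wash pass matching the stack trace above: a full iterator scan over the string data, whose runtime grows linearly with the number of keys:

```cpp
#include <memory>

#include "rocksdb/db.h"

// Illustrative only: a wash pass that visits every key via a full scan,
// which is the DBIter::Next() loop visible in the stack trace. Runtime
// is O(total string keys), hence the long startup on a 215 GB dataset.
bool WashData(rocksdb::DB* db) {
  rocksdb::ReadOptions read_options;
  std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(read_options));
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    // Inspect it->key()/it->value() and rewrite entries carrying the
    // dirty format written by the 4.0.0 bug (details omitted here).
  }
  return it->status().ok();
}
```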
Turned off wash-data in the configuration.