Memory leak in pika 4.0

Is this a regression?

Yes

Description

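Single instance under a sustained hset stress test at 20,000-40,000 QPS. After one hour, memory usage reaches 100%.

Checked with: valgrind --leak-check=full --tool=memcheck --log-file=valgrind_output.txt ./pika -c pika_9221.conf (where ./pika is the pika binary)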

Please provide a link to a minimal reproduction of the bug

No response

Screenshots or videos

No response

Please provide the version you discovered this bug in (check about page for version information)

Version: built from the unstable branch
OS: CentOS 7

Anything else?

The issue has already been shared with 少一 @w

Mar 20 '24 01:03 chenbt-hz

Please build from the current unstable branch and test again.

Mar 20 '24 04:03 AlexStocks

Re: "Please build from the current unstable branch and test again."

Code version

Test results on the unstable branch

Mar 20 '24 08:03 chenbt-hz

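I suspect the memory usage you are seeing includes the page cache, and that most of it is page cache. Next time this happens, drop the page cache and check whether memory usage falls: echo 3 | sudo tee /proc/sys/vm/drop_caches (with plain "sudo echo 3 >> ...", the redirection runs in the unprivileged shell and fails unless you are already root) @chenbt-hz cc @AlexStocks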
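
One way to check the page cache hypothesis before dropping anything is to read /proc/meminfo and see how much of total RAM is file cache. The sketch below is illustrative only: the helper names parse_meminfo, page_cache_fraction, and read_meminfo are made up for this example, and it treats Cached + Buffers as a rough stand-in for reclaimable page cache (Cached also counts shared memory, which drop_caches does not free).

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def page_cache_fraction(info):
    """Approximate share of MemTotal held by file cache (Cached + Buffers).

    Assumption: Cached + Buffers is used as a rough proxy for page cache.
    """
    cache_kb = info.get("Cached", 0) + info.get("Buffers", 0)
    return cache_kb / info["MemTotal"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On the test machine, page_cache_fraction(read_meminfo()) gives the cache share at that moment. If that share is high while the pika process RSS is modest, the "100% memory" reading is more likely cache than a leak.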

Mar 31 '24 11:03 wangshao1

Also, it looks like you have swap enabled? For production, we recommend turning swap off.

Mar 31 '24 14:03 wangshao1

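Summary so far:

With a large block_size (for example, 64G), memory usage is correspondingly large.
The memory statistics reported by info may not account for block_size.
With block_size disabled, memory usage still grows slowly during sustained data import, but the impact is small (roughly 1G of growth per 1T imported). The cause of the growth still needs to be identified.

Next steps:

Improve the memory statistics reported by info.
Identify the cause of the memory growth.
Run a longer verification test.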
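
For the longer verification run, the "roughly 1G per 1T imported" figure can be tracked as a fitted slope rather than read off a graph. The following is a minimal sketch under stated assumptions: the function name growth_gb_per_tb is hypothetical, and the samples are (data imported in TB, memory in GB) pairs that you collect yourself, for example from periodic info output.

```python
def growth_gb_per_tb(samples):
    """Least-squares slope of memory (GB) against data imported (TB).

    samples: iterable of (imported_tb, memory_gb) pairs, at least two
    with distinct imported_tb values.
    """
    samples = list(samples)
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    if sxx == 0:
        raise ValueError("imported_tb values must not all be equal")
    return sxy / sxx
```

A slope that stays near a small constant suggests steady bookkeeping overhead. A slope that keeps rising between runs would be a stronger sign of a real leak.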

Apr 09 '24 09:04 chenbt-hz

少一: check how many SST files a single RocksDB instance holds; max files should be smaller than that SST file count.

Apr 26 '24 12:04 AlexStocks

It is now fairly certain that pika currently has no memory leak.

Apr 26 '24 12:04 AlexStocks

In the config file, set cache-index-and-filter-blocks to yes. With this enabled, RocksDB stores index and filter data in the block cache. When the block cache runs short, entries are evicted by LRU, so memory usage stays largely bounded.

https://github.com/OpenAtomFoundation/pika/issues/1048

https://github.com/OpenAtomFoundation/pika/issues/1561#issuecomment-1575896838

See these two issues for reference.
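
To illustrate why routing index and filter blocks through the block cache bounds memory, here is a toy byte-capped LRU cache. It is a simplified model for intuition only, not RocksDB's implementation: the class name ByteBoundedLRU, the block keys, and the sizes are all hypothetical.

```python
from collections import OrderedDict


class ByteBoundedLRU:
    """Toy LRU cache capped by total bytes, not entry count."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._blocks = OrderedDict()  # key -> size in bytes, oldest first

    def __contains__(self, key):
        return key in self._blocks

    def touch(self, key):
        """Mark a block as recently used; return True on a cache hit."""
        if key in self._blocks:
            self._blocks.move_to_end(key)
            return True
        return False

    def insert(self, key, size):
        """Add a block, evicting least recently used blocks while over capacity."""
        if key in self._blocks:
            self.used -= self._blocks.pop(key)
        self._blocks[key] = size
        self.used += size
        # Keep at least the newest block even if it alone exceeds capacity.
        while self.used > self.capacity and len(self._blocks) > 1:
            _, evicted_size = self._blocks.popitem(last=False)
            self.used -= evicted_size
```

In this model, data, index, and filter blocks all compete for the same capacity, so total usage cannot grow past the cap by more than one block. Index and filter memory held outside the cache has no such cap and can grow with the number of open SST files.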

Apr 26 '24 12:04 lqxhub

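Re: "In the config file, set cache-index-and-filter-blocks to yes. With this enabled, RocksDB stores index and filter data in the block cache. When the block cache runs short, entries are evicted by LRU, so memory usage stays largely bounded. See #1048 and #1561 (comment) for reference."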

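Built from the unstable branch and wrote values of 3k length with hset, testing with the following configuration: db-instance-num: 6, block-cache: 128M, max-cache-file=5000, enable-partitioned-index-filters: yes, cache-index-and-filter-blocks=true, pin_l0_filter_and_index_blocks_in_cache = yes, share-block-cache: true

After writing 1.6T of data, memory usage is basically stable. The tablereader curve keeps growing and has not yet leveled off, but the impact is relatively manageable.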
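
Since no reproduction script was attached, here is a minimal sketch of how the hset write could be generated with only the Python standard library, by encoding the command in the Redis wire protocol (RESP) that pika accepts. Everything specific here is an assumption for illustration: the helper names, the key and field names, the host and port, and treating "3k" as 3072 bytes.

```python
import socket


def encode_resp(*args):
    """Encode one command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def hset_payload(index, value_size=3072):
    """Build one HSET with a filler value (key and field names are hypothetical)."""
    return encode_resp("HSET", "bench:hash", "field:%d" % index, b"x" * value_size)


def send_hsets(host, port, count):
    """Send `count` HSET commands one at a time over a single connection."""
    with socket.create_connection((host, port)) as conn:
        for i in range(count):
            conn.sendall(hset_payload(i))
            conn.recv(4096)  # read the reply to this command before sending the next
```

A call such as send_hsets("127.0.0.1", 9221, 100000) would drive a single connection, assuming the instance listens on that address. Reaching the 20,000-40,000 QPS reported in the issue would likely need several connections or pipelining.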

May 07 '24 02:05 chenbt-hz