Memory leak in 4.0
Is this a regression?
Yes
Description
Single instance, sustained hset stress test at 20k–40k QPS; after one hour, memory usage reached 100%.
Checked with: valgrind --leak-check=full --tool=memcheck --log-file=valgrind_output.txt ./pika -c pika_9221.conf
Please provide a link to a minimal reproduction of the bug
No response
Screenshots or videos
No response
Please provide the version you discovered this bug in (check about page for version information)
Version: compiled from the unstable branch
OS: centos7
Anything else?
The issue details have been synced with 少一 @w
Compile with the current unstable branch code and test again.
Code version:
Test results on the unstable branch:
I suspect your memory usage figure includes the page cache, and that most of that memory is page cache. Next time this happens, drop the page cache and see whether memory falls back: echo 3 | sudo tee /proc/sys/vm/drop_caches (note that sudo echo 3 >> /proc/sys/vm/drop_caches does not work, because the redirection runs in the unprivileged shell). @chenbt-hz cc @AlexStocks
Also, it looks like you have swap enabled? For production, it is recommended to turn swap off.
Summary of the situation:
- When block_size is large (for example 64G), it occupies a relatively large amount of memory.
- The info memory statistics may not account for block_size.
- Testing with block_size disabled, memory usage still grows slowly during sustained data import, but the impact is small (roughly 1 GB of growth per 1 TB imported); the cause of that growth still needs to be identified.
Next steps:
- Improve the info memory statistics.
- Determine the cause of the memory growth.
- Run longer validation.
少一: check how many SST files a single RocksDB instance has; max files should be smaller than that SST file count.
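A quick way to get the per-instance SST file count asked about above. This sketch builds a throwaway directory with fake .sst files so it is self-contained; in practice, point find at your actual per-instance rocksdb directories (they depend on your db-path setting):

```shell
# Disposable demo layout standing in for pika's per-instance
# rocksdb directories (real paths depend on db-path).
demo=$(mktemp -d)
mkdir -p "$demo/db0" "$demo/db1"
touch "$demo/db0/000001.sst" "$demo/db0/000002.sst" "$demo/db1/000003.sst"

# Count SST files per instance; compare against max-cache-files.
for d in "$demo"/db*; do
  n=$(find "$d" -name '*.sst' | wc -l)
  printf '%s: %d SST files\n' "${d##*/}" "$n"
done

rm -rf "$demo"
```

If the configured max files cap is larger than the SST count, every table file's metadata can stay cached, and the corresponding memory no longer behaves like a bounded cache.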
It is now fairly certain that pika currently has no memory leak.
After enabling cache-index-and-filter-blocks: yes in the config file, RocksDB puts index and filter data into the block cache; when the block cache fills up, it evicts data via LRU, so memory usage stays largely bounded.
https://github.com/OpenAtomFoundation/pika/issues/1048
https://github.com/OpenAtomFoundation/pika/issues/1561#issuecomment-1575896838
See these two issues for reference.
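As a sketch, the settings discussed above would look like this in pika.conf. Option names are the ones used in this thread; the block-cache size is an example value, and exact syntax and defaults may differ across pika versions:

```
# put index & filter blocks in the block cache so LRU bounds their memory
cache-index-and-filter-blocks : yes
# one shared LRU block cache across instances; size it explicitly
share-block-cache : yes
# example size (256 MB), not a recommendation
block-cache : 268435728
```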
After compiling from the unstable branch, ran an hset write test with 3 KB values and the following configuration:
- db-instance-num: 6
- block-cache: 128M
- max-cache-file: 5000
- enable-partitioned-index-filters: yes
- cache-index-and-filter-blocks: yes
- pin_l0_filter_and_index_blocks_in_cache: yes
- share-block-cache: true
After writing 1.6 TB of data, memory usage was basically stable. The tablereader curve keeps growing and has not yet been seen to plateau, but its impact is relatively manageable.