milvus
[Bug]: v2.4.0 datanode memory usage is too high
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: v2.4.0
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): kafka
- SDK version(e.g. pymilvus v2.0.0rc2): 2.7
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: 544c /4291.6 G at least
- GPU: 0
- Others: datanode
Current Behavior
Following the sizing tool, I allocated the Data Nodes as 2 cores / 8 GB x 2 pods. In actual operation they hit OOM, and after scaling up, memory usage reached 40 GB.
Expected Behavior
Following the sizing tool, I allocated the Data Nodes as 2 cores / 8 GB x 2 pods. In actual operation they hit OOM, and after scaling up, memory usage reached 40 GB.
Steps To Reproduce
Following the sizing tool, I allocated the Data Nodes as 2 cores / 8 GB x 2 pods. In actual operation they hit OOM, and after scaling up, memory usage reached 40 GB.
Milvus Log
No response
Anything else?
No response
The title and description of this issue contains Chinese. Please use English to describe your issue.
Following the sizing tool, I allocated the Data Nodes as 2 cores / 8 GB x 2 pods. However, in actual operation the Data Nodes hit OOM, and after scaling up, their memory usage reached 40 GB.
@yesyue please share more info about how you are using Milvus, e.g. what kinds of requests you send to Milvus, how many, and how frequently. Also, please provide the logs of all the Milvus pods for investigation.
/assign @yesyue /unassign
We write 100 million entities per day to Milvus.
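For a sense of scale, here is a rough sketch that turns the daily volume above into a sustained rate. The vector dimension (768) and the 4-byte float32 element size are assumptions for illustration only; the thread does not state the schema.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def entities_per_second(entities_per_day: int) -> float:
    """Average ingest rate, assuming writes are spread evenly over the day."""
    return entities_per_day / SECONDS_PER_DAY


def raw_vector_bytes_per_second(entities_per_day: int, dim: int,
                                bytes_per_value: int = 4) -> float:
    """Raw vector payload per second, ignoring scalar fields and overhead."""
    return entities_per_second(entities_per_day) * dim * bytes_per_value


if __name__ == "__main__":
    daily = 100_000_000   # from the report above
    assumed_dim = 768     # ASSUMPTION: the dimension is not given in the thread
    print(f"{entities_per_second(daily):.0f} entities/s")
    mb_per_s = raw_vector_bytes_per_second(daily, assumed_dim) / 1e6
    print(f"{mb_per_s:.2f} MB/s of raw vectors (assumed dim={assumed_dim})")
```

At roughly 1,157 entities per second on average, any period in which flushing falls behind lets buffered data accumulate quickly, and real traffic is usually burstier than this even average.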
After I inserted 10M entities in total, the Milvus Docker container stopped and crashed. I use an IVF_SQ8 index and installed Milvus with GPU support. I insert in batches of 10,000 (an insert is only sent once 10,000 entities have accumulated).
After the crash I can't connect again and can't use anything. Any solution?
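The "only insert once 10,000 entities have accumulated" pattern described above can be sketched roughly as follows. `BatchBuffer` and `sink` are hypothetical names for illustration; in a real client the sink would wrap whatever insert call the SDK provides.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class BatchBuffer:
    """Accumulates rows client-side and hands them to `sink` in fixed-size batches."""

    def __init__(self, sink: Callable[[List[Row]], None], batch_size: int = 10_000):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self._sink = sink
        self._batch_size = batch_size
        self._pending: List[Row] = []

    def add(self, row: Row) -> None:
        """Buffer one row; send a batch as soon as the threshold is reached."""
        self._pending.append(row)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, even if it is smaller than a full batch."""
        if self._pending:
            self._sink(self._pending)
            self._pending = []


if __name__ == "__main__":
    sent_sizes: List[int] = []
    buf = BatchBuffer(lambda rows: sent_sizes.append(len(rows)), batch_size=10_000)
    for i in range(25_000):
        buf.add({"id": i})
    buf.flush()  # without this, the trailing partial batch would never be sent
    print(sent_sizes)
```

Note that this only controls how the client groups writes. It does not limit how much data the server side buffers before flushing, which is the concern raised in this issue.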
- It seems that flush cannot keep up with the read.
- How many partitions do you have? If you have many partitions or collections, the flush load and memory consumption will be larger than the estimate.
- There are a bunch of configs to tune, such as the concurrent flush number -> dataNode.dataSync.maxParallelSyncMgrTasks (for 2.4), and the memory used for growing segments.
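As a sketch of how the dotted key mentioned above maps onto a nested configuration override, the snippet below expands it into a nested mapping and prints it in YAML-like form. The value `2` is purely a placeholder, not a recommendation, and `nest` and `render` are hypothetical helpers; where the override is applied depends on the deployment.

```python
from typing import Any, Dict, List


def nest(dotted_key: str, value: Any) -> Dict[str, Any]:
    """Expand 'a.b.c' into {'a': {'b': {'c': value}}}."""
    node: Any = value
    for part in reversed(dotted_key.split(".")):
        node = {part: node}
    return node


def render(mapping: Dict[str, Any], indent: int = 0) -> List[str]:
    """Render a nested mapping as indented 'key: value' lines."""
    lines: List[str] = []
    for key, val in mapping.items():
        if isinstance(val, dict):
            lines.append(" " * indent + f"{key}:")
            lines.extend(render(val, indent + 2))
        else:
            lines.append(" " * indent + f"{key}: {val}")
    return lines


if __name__ == "__main__":
    # ASSUMPTION: 2 is a placeholder value chosen only for illustration.
    override = nest("dataNode.dataSync.maxParallelSyncMgrTasks", 2)
    print("\n".join(render(override)))
```

Any change to this setting is best validated against datanode memory and flush metrics before and after, since the right value depends on the workload.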
> We write 100 million entities per day to Milvus.
> After I inserted 10M entities in total, the Milvus Docker container stopped and crashed. I use an IVF_SQ8 index and installed Milvus with GPU support. I insert in batches of 10,000 (an insert is only sent once 10,000 entities have accumulated).
> After the crash I can't connect again and can't use anything. Any solution?
How much GPU memory do you have? Please open another issue with detailed logs so we can help.
1. Could you provide the datanode log? 2. It would be great if you could capture a datanode pprof, so we know which part takes up the memory. Most likely the insert buffer is consuming the memory, and you can tune the flush parameters.
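A minimal sketch for capturing the heap profile requested above. It assumes the datanode exposes the standard Go `net/http/pprof` endpoints on its metrics port (9091 is assumed here) and that the pod is reachable from where the script runs; neither is confirmed in this thread. The host name `my-datanode-pod` is a placeholder.

```python
import urllib.request


def heap_profile_url(host: str, port: int = 9091) -> str:
    """Build the heap profile URL (ASSUMPTION: pprof is served on this port)."""
    return f"http://{host}:{port}/debug/pprof/heap"


def fetch_heap_profile(host: str, out_path: str, port: int = 9091,
                       timeout: float = 30.0) -> int:
    """Download the heap profile to `out_path` and return the number of bytes written."""
    with urllib.request.urlopen(heap_profile_url(host, port), timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


# Example (placeholder host; requires network access to the pod):
# fetch_heap_profile("my-datanode-pod", "datanode-heap.pb.gz")
```

The saved file can then be inspected with `go tool pprof datanode-heap.pb.gz` to see which allocations dominate.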
I've seen you in many issues and we'd like to offer help. Feel free to contact me at [email protected] if necessary.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.