[Bug]: [benchmark] milvus insert data datanode memory rise
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version:2.2.0-20230410-d845175f
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka):
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
In the 1B-data continuous search test scenario, inserting the 1B dataset previously kept datanode memory below 1.4G; now, at the same insert frequency, datanode memory usage exceeds 3G and is still growing.
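For context, a minimal pymilvus sketch of this kind of continuous-insert driver; the collection name, dim, and vector data are illustrative assumptions, while the 50k-row batch size matches the id ranges in the client log later in this thread.

```python
# Minimal sketch (not the fouram benchmark itself) of the continuous-insert
# workload: 50k-row batches driven toward 1B rows.
# Collection name and dim are assumptions; the collection is assumed to exist.
import random
from pymilvus import connections, Collection

connections.connect(host="127.0.0.1", port="19530")
collection = Collection("fouram_1b")  # assumed, pre-created collection
DIM, BATCH, TOTAL = 128, 50_000, 1_000_000_000

for start in range(0, TOTAL, BATCH):
    ids = list(range(start, start + BATCH))
    vectors = [[random.random() for _ in range(DIM)] for _ in range(BATCH)]
    collection.insert([ids, vectors])
```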
image: 2.1.0-20220726-1b33c731
datanode memory usage (709m + 733m combined does not exceed 1.4G):
image: 2.2.0-20230410-d845175f
qtp-1b-test-lbcyr-etcd-0 1/1 Running 0 6m23s 10.104.5.110 4am-node12 <none> <none>
qtp-1b-test-lbcyr-etcd-1 1/1 Running 0 6m22s 10.104.6.235 4am-node13 <none> <none>
qtp-1b-test-lbcyr-etcd-2 1/1 Running 0 6m22s 10.104.4.48 4am-node11 <none> <none>
qtp-1b-test-lbcyr-milvus-datacoord-5b489d5f57-dnvw7 1/1 Running 0 6m23s 10.104.4.31 4am-node11 <none> <none>
qtp-1b-test-lbcyr-milvus-datanode-5888986546-9kkjt 1/1 Running 1 (2m22s ago) 6m23s 10.104.6.228 4am-node13 <none> <none>
qtp-1b-test-lbcyr-milvus-indexcoord-6bd8dd4d7-jrjj2 1/1 Running 1 (2m22s ago) 6m23s 10.104.4.33 4am-node11 <none> <none>
qtp-1b-test-lbcyr-milvus-indexnode-58b84db4c8-ctldx 1/1 Running 0 6m23s 10.104.9.122 4am-node14 <none> <none>
qtp-1b-test-lbcyr-milvus-proxy-7b9c67b545-st8ch 1/1 Running 1 (2m22s ago) 6m23s 10.104.5.101 4am-node12 <none> <none>
qtp-1b-test-lbcyr-milvus-querycoord-695fb8f5b-nfmws 1/1 Running 1 (2m22s ago) 6m23s 10.104.4.35 4am-node11 <none> <none>
qtp-1b-test-lbcyr-milvus-querynode-8574948fc4-kn8jz 1/1 Running 0 6m23s 10.104.6.227 4am-node13 <none> <none>
qtp-1b-test-lbcyr-milvus-querynode-8574948fc4-l24r2 1/1 Running 0 6m23s 10.104.9.119 4am-node14 <none> <none>
qtp-1b-test-lbcyr-milvus-querynode-8574948fc4-p5cbs 1/1 Running 0 6m23s 10.104.4.37 4am-node11 <none> <none>
qtp-1b-test-lbcyr-milvus-querynode-8574948fc4-p5rjt 1/1 Running 0 6m23s 10.104.5.100 4am-node12 <none> <none>
qtp-1b-test-lbcyr-milvus-querynode-8574948fc4-vr7vc 1/1 Running 0 6m23s 10.104.4.40 4am-node11 <none> <none>
qtp-1b-test-lbcyr-milvus-querynode-8574948fc4-xf8cn 1/1 Running 0 6m23s 10.104.1.120 4am-node10 <none> <none>
qtp-1b-test-lbcyr-milvus-rootcoord-5fb7645c68-mnpcb 1/1 Running 1 (2m22s ago) 6m23s 10.104.4.34 4am-node11 <none> <none>
qtp-1b-test-lbcyr-minio-0 1/1 Running 0 6m23s 10.104.5.109 4am-node12 <none> <none>
qtp-1b-test-lbcyr-minio-1 1/1 Running 0 6m23s 10.104.6.234 4am-node13 <none> <none>
qtp-1b-test-lbcyr-minio-2 1/1 Running 0 6m22s 10.104.4.46 4am-node11 <none> <none>
qtp-1b-test-lbcyr-minio-3 1/1 Running 0 6m22s 10.104.9.124 4am-node14 <none> <none>
qtp-1b-test-lbcyr-pulsar-bookie-0 1/1 Running 0 6m23s 10.104.5.107 4am-node12 <none> <none>
qtp-1b-test-lbcyr-pulsar-bookie-1 1/1 Running 0 6m23s 10.104.6.233 4am-node13 <none> <none>
qtp-1b-test-lbcyr-pulsar-bookie-2 1/1 Running 0 6m22s 10.104.4.45 4am-node11 <none> <none>
qtp-1b-test-lbcyr-pulsar-bookie-init-qk8wj 0/1 Completed 0 6m23s 10.104.4.38 4am-node11 <none> <none>
qtp-1b-test-lbcyr-pulsar-broker-0 1/1 Running 0 6m23s 10.104.4.36 4am-node11 <none> <none>
qtp-1b-test-lbcyr-pulsar-proxy-0 1/1 Running 0 6m23s 10.104.5.99 4am-node12 <none> <none>
qtp-1b-test-lbcyr-pulsar-pulsar-init-q6n6t 0/1 Completed 0 6m23s 10.104.4.32 4am-node11 <none> <none>
qtp-1b-test-lbcyr-pulsar-recovery-0 1/1 Running 0 6m23s 10.104.4.39 4am-node11 <none> <none>
qtp-1b-test-lbcyr-pulsar-zookeeper-0 1/1 Running 0 6m23s 10.104.5.106 4am-node12 <none> <none>
qtp-1b-test-lbcyr-pulsar-zookeeper-1 1/1 Running 0 5m44s 10.104.6.237 4am-node13 <none> <none>
qtp-1b-test-lbcyr-pulsar-zookeeper-2 1/1 Running 0 5m2s 10.104.4.54 4am-node11 <none> <none>
datanode memory usage:
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response
can we change the pod limit to 2G and increase the ingest concurrency and see if it OOMed?
/assign @elstic please retry as suggested above, and did this run with datanode.memory.forceSynceEnable=True?
I will retry and comment the results here. @yanliang567 Yes, forceEnable is true on images starting with 2.2.0
With the limit set to 2G, no OOM occurs; inserts are just denied. Image: 2.2.0-20230412-51f5a128
client error log (a retry sketch follows the log):
[2023-04-13 15:45:28,238 - INFO - fouram]: [Base] Start inserting, ids: 69450000 - 69499999, data size: 1,000,000,000 (base.py:157)
[2023-04-13 15:45:29,277 - INFO - fouram]: [Time] Collection.insert run in 1.0378s (api_request.py:41)
[2023-04-13 15:45:29,279 - INFO - fouram]: [Base] Number of vectors in the collection(fouram_4sfUyz8B): 69400000 (base.py:305)
[2023-04-13 15:45:30,629 - INFO - fouram]: [Base] Start inserting, ids: 69500000 - 69549999, data size: 1,000,000,000 (base.py:157)
[2023-04-13 15:45:31,350 - ERROR - fouram]: RPC error: [batch_insert], <MilvusException: (code=53, message=deny to write, reason: memory quota exhausted, please allocate more resources, req: /milvus.proto.milvus.MilvusService/Insert)>, <Time:{'RPC start': '2023-04-13 15:45:30.652223', 'RPC error': '2023-04-13 15:45:31.350141'}> (decorators.py:108)
[2023-04-13 15:45:31,504 - ERROR - fouram]: Traceback (most recent call last):
File "/src/fouram/client/util/api_request.py", line 33, in inner_wrapper
res = func(*args, **kwargs)
File "/src/fouram/client/util/api_request.py", line 70, in api_request
return func(*arg, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pymilvus/orm/collection.py", line 430, in insert
res = conn.batch_insert(self._name, entities, partition_name,
File "/usr/local/lib/python3.8/dist-packages/pymilvus/decorators.py", line 109, in handler
raise e
pymilvus.exceptions.MilvusException: <MilvusException: (code=53, message=deny to write, reason: memory quota exhausted, please allocate more resources, req: /milvus.proto.milvus.MilvusService/Insert)>
(api_request.py:48)
[2023-04-13 15:45:31,505 - ERROR - fouram]: (api_response) : <MilvusException: (code=53, message=deny to write, reason: memory quota exhausted, please allocate more resources, req: /milvus.proto.milvus.MilvusService/Insert)> (api_request.py:49)
[2023-04-13 15:45:31,505 - ERROR - fouram]: [CheckFunc] insert request check failed, response:<MilvusException: (code=53, message=deny to write, reason: memory quota exhausted, please allocate more resources, req: /milvus.proto.milvus.MilvusService/Insert)> (func_check.py:49)
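The denials above come from the server's memory-quota protection (error code 53, "deny to write, reason: memory quota exhausted"). A minimal sketch, not fouram's actual handling, of how a client loop could back off and resume once the quota recovers:

```python
# Hedged sketch (not fouram's actual code): back off and retry when the server
# denies writes with "memory quota exhausted" instead of failing the batch.
import time
from pymilvus.exceptions import MilvusException

def insert_with_backoff(collection, batch, max_wait_s=600, interval_s=10):
    """Retry an insert while the server reports memory-quota denials."""
    waited = 0
    while True:
        try:
            return collection.insert(batch)
        except MilvusException as e:
            # Error code 53 / "memory quota exhausted" is the deny-to-write
            # seen in the log above; anything else is re-raised immediately.
            if "memory quota exhausted" not in str(e) or waited >= max_wait_s:
                raise
            time.sleep(interval_s)
            waited += interval_s
```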
server:
fouramf-9m29m-3-9553-etcd-0 1/1 Running 0 6m4s 10.104.6.112 4am-node13 <none> <none>
fouramf-9m29m-3-9553-etcd-1 1/1 Running 0 6m4s 10.104.9.38 4am-node14 <none> <none>
fouramf-9m29m-3-9553-etcd-2 1/1 Running 0 6m4s 10.104.5.58 4am-node12 <none> <none>
fouramf-9m29m-3-9553-milvus-datacoord-559f6ff7f5-wppvc 1/1 Running 1 (2m3s ago) 6m4s 10.104.6.99 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-datanode-599d7f56f4-2hqrp 1/1 Running 1 (2m2s ago) 6m4s 10.104.6.103 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-indexcoord-57658f74-wsd79 1/1 Running 1 (2m3s ago) 6m4s 10.104.6.98 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-indexnode-5b47f64857-q4v7d 1/1 Running 0 6m4s 10.104.1.171 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-proxy-6fdb748466-kjtsw 1/1 Running 1 (2m2s ago) 6m4s 10.104.1.172 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-querycoord-65b45474f-9dfb4 1/1 Running 1 (2m2s ago) 6m4s 10.104.1.169 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-2d8x4 1/1 Running 0 6m4s 10.104.9.35 4am-node14 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-44qzk 1/1 Running 0 6m4s 10.104.6.106 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-fdbqb 1/1 Running 0 6m4s 10.104.5.55 4am-node12 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-h8cwd 0/1 Pending 0 6m4s <none> <none> <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-l7zll 1/1 Running 0 6m4s 10.104.1.174 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-r2vtb 1/1 Running 0 6m4s 10.104.4.211 4am-node11 <none> <none>
fouramf-9m29m-3-9553-milvus-rootcoord-69c697f96c-fpqg9 1/1 Running 1 (2m2s ago) 6m4s 10.104.6.104 4am-node13 <none> <none>
fouramf-9m29m-3-9553-minio-0 1/1 Running 0 6m4s 10.104.6.116 4am-node13 <none> <none>
fouramf-9m29m-3-9553-minio-1 1/1 Running 0 6m3s 10.104.1.183 4am-node10 <none> <none>
fouramf-9m29m-3-9553-minio-2 1/1 Running 0 6m3s 10.104.5.60 4am-node12 <none> <none>
fouramf-9m29m-3-9553-minio-3 1/1 Running 0 6m3s 10.104.9.42 4am-node14 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-0 1/1 Running 0 6m4s 10.104.6.114 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-1 1/1 Running 0 6m4s 10.104.9.40 4am-node14 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-2 1/1 Running 0 6m3s 10.104.1.184 4am-node10 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-init-rxn5h 0/1 Completed 0 6m4s 10.104.6.101 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-broker-0 1/1 Running 0 6m4s 10.104.6.97 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-proxy-0 1/1 Running 0 6m4s 10.104.1.173 4am-node10 <none> <none>
fouramf-9m29m-3-9553-pulsar-pulsar-init-ln7hh 0/1 Completed 0 6m4s 10.104.6.102 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-recovery-0 1/1 Running 0 6m4s 10.104.6.105 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-zookeeper-0 1/1 Running 0 6m4s 10.104.6.113 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-zookeeper-1 1/1 Running 0 5m6s 10.104.1.186 4am-node10 <none> <none>
fouramf-9m29m-3-9553-pulsar-zookeeper-2 1/1 Running 0 4m17s 10.104.4.215 4am-node11 <none> <none> (base.py:173)
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
fouramf-9m29m-3-9553-etcd-0 1/1 Running 0 57m 10.104.6.112 4am-node13 <none> <none>
fouramf-9m29m-3-9553-etcd-1 1/1 Running 0 57m 10.104.9.38 4am-node14 <none> <none>
fouramf-9m29m-3-9553-etcd-2 1/1 Running 0 57m 10.104.5.58 4am-node12 <none> <none>
fouramf-9m29m-3-9553-milvus-datacoord-559f6ff7f5-wppvc 1/1 Running 1 (53m ago) 57m 10.104.6.99 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-datanode-599d7f56f4-2hqrp 0/1 Running 2 (9s ago) 57m 10.104.6.103 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-indexcoord-57658f74-wsd79 1/1 Running 1 (53m ago) 57m 10.104.6.98 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-indexnode-5b47f64857-q4v7d 1/1 Running 0 57m 10.104.1.171 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-proxy-6fdb748466-kjtsw 1/1 Running 1 (53m ago) 57m 10.104.1.172 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-querycoord-65b45474f-9dfb4 1/1 Running 1 (53m ago) 57m 10.104.1.169 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-2d8x4 1/1 Running 0 57m 10.104.9.35 4am-node14 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-44qzk 1/1 Running 0 57m 10.104.6.106 4am-node13 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-fdbqb 1/1 Running 0 57m 10.104.5.55 4am-node12 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-h8cwd 1/1 Running 0 57m 10.104.5.62 4am-node12 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-l7zll 1/1 Running 0 57m 10.104.1.174 4am-node10 <none> <none>
fouramf-9m29m-3-9553-milvus-querynode-846997cfbf-r2vtb 1/1 Running 0 57m 10.104.4.211 4am-node11 <none> <none>
fouramf-9m29m-3-9553-milvus-rootcoord-69c697f96c-fpqg9 1/1 Running 1 (53m ago) 57m 10.104.6.104 4am-node13 <none> <none>
fouramf-9m29m-3-9553-minio-0 1/1 Running 0 57m 10.104.6.116 4am-node13 <none> <none>
fouramf-9m29m-3-9553-minio-1 1/1 Running 0 57m 10.104.1.183 4am-node10 <none> <none>
fouramf-9m29m-3-9553-minio-2 1/1 Running 0 57m 10.104.5.60 4am-node12 <none> <none>
fouramf-9m29m-3-9553-minio-3 1/1 Running 0 57m 10.104.9.42 4am-node14 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-0 1/1 Running 0 57m 10.104.6.114 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-1 1/1 Running 0 57m 10.104.9.40 4am-node14 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-2 1/1 Running 0 57m 10.104.1.184 4am-node10 <none> <none>
fouramf-9m29m-3-9553-pulsar-bookie-init-rxn5h 0/1 Completed 0 57m 10.104.6.101 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-broker-0 1/1 Running 0 57m 10.104.6.97 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-proxy-0 1/1 Running 0 57m 10.104.1.173 4am-node10 <none> <none>
fouramf-9m29m-3-9553-pulsar-pulsar-init-ln7hh 0/1 Completed 0 57m 10.104.6.102 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-recovery-0 1/1 Running 0 57m 10.104.6.105 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-zookeeper-0 1/1 Running 0 57m 10.104.6.113 4am-node13 <none> <none>
fouramf-9m29m-3-9553-pulsar-zookeeper-1 1/1 Running 0 56m 10.104.1.186 4am-node10 <none> <none>
fouramf-9m29m-3-9553-pulsar-zookeeper-2 1/1 Running 0 55m 10.104.4.215 4am-node11 <none> <none>
/assign @jiaoew1991 /unassign @elstic @yanliang567
/assign @bigsheeper
could you take a look at why the force flush is not happening?
working on it
Test 1: Used pprof to analyze DataNode memory (a profile-collection sketch follows these notes). Result:
- About 50% memory consumed by insertBuffer;
- About 50% memory consumed by msgstream receive buffer;
Test 2: Reduced the msgstream receive buffer size from 1024 to 10, then analyzed with pprof again. Result:
- About 90% memory consumed by insertBuffer;
The insertBuffer consumed about 2GB, yet DataNode memory usage is about 5GB, so I suspect C/C++ code consumed a lot of memory that we still can't account for.
Test 3: I assumed the Arrow payload writer in C++ consumed a lot of memory, so I added a log to print the Arrow memory pool size. Result:
- The Arrow memory pool consumed only about 80MB.
I'll try jeprof/heaptrack to analyze the C/C++ memory.
Test 4: I attempted to replicate this issue on my local machine and observed that DataNode's memory usage increases gradually over time. I analyzed it with heaptrack. Result:
- There is an unusual cmalloc memory consumption in the FlameGraph.
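For reference, a minimal sketch of how heap profiles like the ones used in Tests 1 and 2 can be collected over time. It assumes the datanode exposes Go's standard net/http/pprof endpoints on its metrics port (9091 by default) and that the pod has been port-forwarded locally; adjust the URL if your deployment serves pprof elsewhere.

```python
# Hedged sketch: snapshot the datanode's Go heap profile periodically so that
# growth in insertBuffer / msgstream buffers can be compared over time.
# Assumption: net/http/pprof is served on the metrics port (9091 by default)
# and the datanode pod has been port-forwarded to localhost.
import time
import urllib.request

PPROF_URL = "http://127.0.0.1:9091/debug/pprof/heap"

for i in range(12):  # one snapshot every 5 minutes, for an hour
    data = urllib.request.urlopen(PPROF_URL, timeout=30).read()
    with open(f"datanode-heap-{i:02d}.pb.gz", "wb") as f:
        f.write(data)
    time.sleep(300)
# Compare snapshots with: go tool pprof -http=:8080 datanode-heap-00.pb.gz
```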
https://github.com/milvus-io/milvus/pull/23138 upgraded Arrow and may resolve this issue; please help to verify @elstic
/assign @elstic
After verification, the datanode memory rise still exists. Image: 2.2.0-20230525-ef1a671d, argo task: fouramf-4vkjr
/unassign @bigsheeper
https://github.com/milvus-io/milvus/pull/24656 switches to the Go payload writer; the datanode OOM issue may be resolved.
/assign @elstic please help to verify
This issue still exists.
image: 2.2.0-20230608-a03ebcff
Is this as expected? Unless the datanode reaches a certain threshold to do auto flush?
This is not expected; previously, with 20 concurrent inserts, datanode memory did not rise.
The datanode should reserve pod total memory * 0.5; anything below this is as expected.
We only flush under memory pressure. If the pod memory limit is large, Milvus will try to use 50% of the memory as a data cache.
If your datanode pod has 8G of memory, we would expect it to use a little more than 4GB (see the worked example below).
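A small worked example of the 50% rule described above; the 0.5 ratio is taken from these comments, and the helper name is just for illustration.

```python
# Worked example of the 50% rule stated above: the datanode is expected to
# start syncing once buffered data reaches roughly half of the pod memory
# limit. The 0.5 ratio comes from the comments in this thread.
GIB = 1024 ** 3

def expected_flush_watermark(pod_limit_bytes: int, ratio: float = 0.5) -> float:
    """Approximate memory level at which the datanode should start flushing."""
    return pod_limit_bytes * ratio

print(expected_flush_watermark(8 * GIB) / GIB)  # 8Gi pod -> flush around ~4 GiB
print(expected_flush_watermark(4 * GIB) / GIB)  # 4Gi pod -> flush around ~2 GiB
```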
This issue still exists. The datanode restarts after its memory reaches the limit.
case: test_concurrent_locust_100m_hnsw_ddl_dql_filter_output_kafka_cluster, argo task: fouramf-t24hx, duration: 192h
memory usage:
datanode memory usage:
Steps:
1. create a collection or use an existing collection
2. build an HNSW index on the vector column
3. insert 100 million vectors
4. flush collection
5. build index on vector column with the same parameters
6. count the total number of rows
7. load collection
8. execute concurrent search, query, load, scene_test
(scene_test steps:
1) create a collection, 2) insert 3000 rows, 3) flush the collection,
4) create an index, 5) drop the collection; see the sketch after these steps)
9. step 8 lasts 192h
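A hedged pymilvus sketch of one scene_test iteration from step 8; the collection name, dim, and HNSW parameters are illustrative assumptions rather than the fouram framework's actual values.

```python
# Hedged sketch of one scene_test iteration from step 8: create a collection,
# insert 3000 rows, flush, build an index, then drop the collection.
# Names, dim, and index params are illustrative assumptions.
import random
from pymilvus import (
    connections, utility, Collection, CollectionSchema, FieldSchema, DataType,
)

connections.connect(host="127.0.0.1", port="19530")

def scene_test(dim: int = 128, rows: int = 3000) -> None:
    name = f"scene_test_{random.randrange(1 << 30)}"
    schema = CollectionSchema([
        FieldSchema("id", DataType.INT64, is_primary=True),
        FieldSchema("vector", DataType.FLOAT_VECTOR, dim=dim),
    ])
    coll = Collection(name, schema)
    ids = list(range(rows))
    vectors = [[random.random() for _ in range(dim)] for _ in range(rows)]
    coll.insert([ids, vectors])
    coll.flush()
    coll.create_index("vector", {
        "index_type": "HNSW",
        "metric_type": "L2",
        "params": {"M": 8, "efConstruction": 200},
    })
    utility.drop_collection(name)
```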
server:
fouramf-t24hx-30-7593-etcd-0 1/1 Running 0 8d 10.104.17.130 4am-node23 <none> <none>
fouramf-t24hx-30-7593-etcd-1 1/1 Running 0 8d 10.104.4.163 4am-node11 <none> <none>
fouramf-t24hx-30-7593-etcd-2 1/1 Running 0 8d 10.104.14.213 4am-node18 <none> <none>
fouramf-t24hx-30-7593-kafka-0 1/1 Running 1 (8d ago) 8d 10.104.14.212 4am-node18 <none> <none>
fouramf-t24hx-30-7593-kafka-1 1/1 Running 0 8d 10.104.1.104 4am-node10 <none> <none>
fouramf-t24hx-30-7593-kafka-2 1/1 Running 0 8d 10.104.13.175 4am-node16 <none> <none>
fouramf-t24hx-30-7593-milvus-datacoord-6d67b686b5-xnx96 1/1 Running 0 8d 10.104.19.52 4am-node28 <none> <none>
fouramf-t24hx-30-7593-milvus-datanode-7c469b8bdc-vstfh 1/1 Running 2 (2d2h ago) 8d 10.104.19.53 4am-node28 <none> <none>
fouramf-t24hx-30-7593-milvus-indexcoord-6c847cf9f8-4r6mm 1/1 Running 0 8d 10.104.4.160 4am-node11 <none> <none>
fouramf-t24hx-30-7593-milvus-indexnode-86bdff87b7-nqrgn 1/1 Running 0 8d 10.104.17.124 4am-node23 <none> <none>
fouramf-t24hx-30-7593-milvus-proxy-6958c8fcb4-q7m2v 1/1 Running 0 8d 10.104.4.158 4am-node11 <none> <none>
fouramf-t24hx-30-7593-milvus-querycoord-7595cdff77-p9w8d 1/1 Running 0 8d 10.104.19.54 4am-node28 <none> <none>
fouramf-t24hx-30-7593-milvus-querynode-7f6d9d6bf5-p6tk6 1/1 Running 0 8d 10.104.19.55 4am-node28 <none> <none>
fouramf-t24hx-30-7593-milvus-querynode-7f6d9d6bf5-vt4gq 1/1 Running 0 8d 10.104.4.161 4am-node11 <none> <none>
fouramf-t24hx-30-7593-milvus-rootcoord-7b4cb8fd4c-kz976 1/1 Running 0 8d 10.104.19.51 4am-node28 <none> <none>
fouramf-t24hx-30-7593-minio-0 1/1 Running 0 8d 10.104.17.132 4am-node23 <none> <none>
fouramf-t24hx-30-7593-minio-1 1/1 Running 0 8d 10.104.4.165 4am-node11 <none> <none>
fouramf-t24hx-30-7593-minio-2 1/1 Running 0 8d 10.104.5.186 4am-node12 <none> <none>
fouramf-t24hx-30-7593-minio-3 1/1 Running 0 8d 10.104.21.223 4am-node24 <none>
I didn't really understand. Datanode memory is only a few gigabytes. It rises because the datanode accumulates data in memory and decreases when a flush happens.
3G of memory seems very reasonable. You can tune the datanode parameters to decrease memory usage.
Actually, the memory usage decreased not because of a flush, but because of an OOM and restart.
How much memory does datanode have?
The datanode starts to flush at 50% of its allocated memory.
dataNode.resources.limits.cpu=2.0,dataNode.resources.limits.memory=4Gi,dataNode.resources.requests.cpu=2.0,dataNode.resources.requests.memory=3Gi