[Bug]: Standalone pod panic during first deploy test

Open zhuwenxing opened this issue 1 year ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version: 2.2.0-20230418-e1122c2a
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

[2023/04/18 11:22:16.544 +00:00] [WARN] [proxy/impl.go:2151] ["DescribeIndex failed to WaitToFinish"] [error="role indexcoord[nodeID: 4] is not serving, reason: sate code: Abnormal"] [traceID=9824b4d66f35632] [role=proxy] [MsgID=440878207164219394] [BeginTs=440878207164219394] [EndTs=440878207164219394] [db=] [collection=task_1_HNSW] [field=] ["index name"=]
[2023/04/18 11:22:16.544 +00:00] [INFO] [datacoord/services.go:279] ["received request to get collection statistics"] [traceID=589ae7b9e6f3405a] [module=DataCoord]
[2023/04/18 11:22:16.544 +00:00] [WARN] [datacoord/services.go:996] ["DataCoord.GetMetrics failed"] [traceID=4f66cbc368ff7cd1] [nodeID=3] [req="{\"metric_type\":\"system_info\"}"] [error="DataCoord 3 is not ready"]
[2023/04/18 11:22:16.544 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="role datacoord[nodeID: 3] is not serving, reason: sate code: Abnormal"]
[2023/04/18 11:22:16.544 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="role datacoord[nodeID: 3] is not serving, reason: sate code: Abnormal"]
[2023/04/18 11:22:16.567 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="role rootcoord[nodeID: 0] is not serving, reason: sate code: Abnormal"]
[2023/04/18 11:22:16.567 +00:00] [ERROR] [datanode/flow_graph_insert_buffer_node.go:503] ["insertBufferNode flushBufferData failed, err = All attempts results:\nattempt #1:role rootcoord[nodeID: 0] is not serving, reason: sate code: Abnormal\nattempt #2:context canceled\n"] [stack="github.com/milvus-io/milvus/internal/datanode.(*insertBufferNode).Sync\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/flow_graph_insert_buffer_node.go:503\ngithub.com/milvus-io/milvus/internal/datanode.(*insertBufferNode).Operate\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/flow_graph_insert_buffer_node.go:139\ngithub.com/milvus-io/milvus/internal/util/flowgraph.(*nodeCtx).work\n\t/go/src/github.com/milvus-io/milvus/internal/util/flowgraph/node.go:127"]
panic: insertBufferNode flushBufferData failed, err = All attempts results:
attempt #1:role rootcoord[nodeID: 0] is not serving, reason: sate code: Abnormal
attempt #2:context canceled


goroutine 14375 [running]:
github.com/milvus-io/milvus/internal/datanode.(*insertBufferNode).Sync(0xc006344820, 0xc01f569920?, {0xc00780fb58?, 0x1?, 0x1?}, 0xc00d11ca80)
	/go/src/github.com/milvus-io/milvus/internal/datanode/flow_graph_insert_buffer_node.go:504 +0x1285
github.com/milvus-io/milvus/internal/datanode.(*insertBufferNode).Operate(0xc006344820, {0xc017792140, 0x1, 0x1})
	/go/src/github.com/milvus-io/milvus/internal/datanode/flow_graph_insert_buffer_node.go:139 +0x485
github.com/milvus-io/milvus/internal/util/flowgraph.(*nodeCtx).work(0xc006328820)
	/go/src/github.com/milvus-io/milvus/internal/util/flowgraph/node.go:127 +0x2ac
created by github.com/milvus-io/milvus/internal/util/flowgraph.(*nodeCtx).Start
	/go/src/github.com/milvus-io/milvus/internal/util/flowgraph/node.go:71 +0x79

Expected Behavior

All test cases pass.

Steps To Reproduce

No response

Milvus Log

failed job: deploy_test_for_release_cron/detail/deploy_test_for_release_cron/247

log: artifacts-rocksmq-standalone-reinstall-247-server-logs (1).tar.gz

artifacts-rocksmq-standalone-reinstall-247-pytest-logs.tar.gz

Anything else?

No response

zhuwenxing, Apr 19 '23 02:04

/assign @jiaoew1991 /unassign

yanliang567, Apr 19 '23 08:04

Didn't receive any request from etcd; this might be a network issue between Milvus and etcd. Is it stable? @zhuwenxing

smellthemoon, Apr 19 '23 11:04

/assign @smellthemoon /unassign

jiaoew1991, Apr 21 '23 01:04

/assign @zhuwenxing

smellthemoon, Aug 01 '23 10:08

Not reproduced.

zhuwenxing, Aug 01 '23 10:08