[Bug]: Standalone pod restarted when running first test for reinstall

Open zhuwenxing opened this issue 1 year ago • 0 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Environment

- Milvus version:2.2.0-20230411-f1c78610
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka):rocksmq    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

image

image

Expected Behavior

All test cases pass.

Steps To Reproduce

No response

Milvus Log

[2023/04/11 11:17:42.976 +00:00] [ERROR] [grpcclient/client.go:330] ["ClientBase ReCall grpc second call get error"] [role=datacoord] [error="err: failed to connect 10.102.7.44:13333, reason: context deadline exceeded\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:329 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502 github.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195 github.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 golang.org/x/sync/errgroup.(*Group).Go.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:330\ngithub.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502\ngithub.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57"]
[2023/04/11 11:17:42.976 +00:00] [WARN] [rootcoord/quota_center.go:129] ["quotaCenter sync metrics failed"] [error="err: failed to connect 10.102.7.44:13333, reason: context deadline exceeded\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:329 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502 github.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195 github.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 golang.org/x/sync/errgroup.(*Group).Go.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"]
[2023/04/11 11:17:42.976 +00:00] [WARN] [rootcoord/proxy_client_manager.go:239] ["proxy client is empty, GetMetrics will not send to any client"]
[2023/04/11 11:17:42.976 +00:00] [INFO] [rootcoord/root_coord.go:175] ["update rootcoord state"] [state=Abnormal]
[2023/04/11 11:17:42.976 +00:00] [INFO] [rootcoord/root_coord.go:689] ["stop rootcoord executor"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [rootcoord/root_coord.go:696] ["stop rootcoord scheduler"]
[2023/04/11 11:17:42.977 +00:00] [WARN] [retry/retry.go:44] ["retry func failed"] ["retry time"=0] [error="rpc error: code = Canceled desc = grpc: the client connection is closing"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [grpcclient/client.go:280] ["ClientBase grpc error, start to reset connection"] [role=querycoord] [error="rpc error: code = Canceled desc = grpc: the client connection is closing"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [rootcoord/import_manager.go:132] ["import manager context done, exit check sendOutTasksLoop"]
[2023/04/11 11:17:42.977 +00:00] [WARN] [sessionutil/session_util.go:412] ["keep alive"] [error="context done"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [rootcoord/root_coord.go:703] ["cancel rootcoord goroutines"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [rootcoord/import_manager.go:174] ["(in cleanupLoop) import manager context done, exit cleanupLoop"]
[2023/04/11 11:17:42.977 +00:00] [WARN] [grpcclient/client.go:318] ["ClientBase ReCall grpc first call get error"] [role=querycoord] [error="err: rpc error: code = Canceled desc = grpc: the client connection is closing\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:317 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:359 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).GetMetrics\n/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:174 github.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func1\n/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 golang.org/x/sync/errgroup.(*Group).Go.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [rootcoord/timeticksync.go:260] ["rootcoord context done"] [error="context canceled"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [sessionutil/session_util.go:696] ["liveness exits due to context done"]
[2023/04/11 11:17:42.977 +00:00] [INFO] [rootcoord/import_manager.go:150] ["import manager context done, exit check flipTaskStateLoop"]
[2023/04/11 11:17:42.977 +00:00] [WARN] [rootcoord/proxy_manager.go:110] ["stop watching etcd loop"]
[2023/04/11 11:17:42.977 +00:00] [WARN] [timerecord/time_recorder.go:128] ["long term checker [rootTtChecker] shutdown"]
[2023/04/11 11:17:42.979 +00:00] [WARN] [client/client.go:98] ["DataCoordClient, not existed in msess "] [key=datacoord] ["len of msess"=0]
[2023/04/11 11:17:42.979 +00:00] [INFO] [rootcoord/root_coord.go:711] ["revoke rootcoord session"]
[2023/04/11 11:17:42.979 +00:00] [INFO] [rootcoord/service.go:310] ["Rootcoord begin to stop grpc server"]
[2023/04/11 11:17:42.979 +00:00] [INFO] [rootcoord/service.go:313] ["Graceful stop grpc server..."]
[2023/04/11 11:17:42.979 +00:00] [ERROR] [grpcclient/client.go:158] ["failed to get client address"] [error="find no available datacoord, check datacoord state"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).connect\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:158\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).GetGrpcClient\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:131\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).callOnce\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:256\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:312\ngithub.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502\ngithub.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57"]
[2023/04/11 11:17:42.979 +00:00] [WARN] [grpcclient/client.go:318] ["ClientBase ReCall grpc first call get error"] [role=datacoord] [error="err: find no available datacoord, check datacoord state\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:317 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502 github.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195 github.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 golang.org/x/sync/errgroup.(*Group).Go.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"]
{"level":"warn","ts":"2023-04-11T11:17:42.979Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0009eee00/rocksmq-standalone-reinstall-170-etcd:2379","attempt":0,"error":"rpc error: code = Canceled desc = context canceled"}
{"level":"warn","ts":"2023-04-11T11:17:42.979Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0009eee00/rocksmq-standalone-reinstall-170-etcd:2379","attempt":0,"error":"rpc error: code = Canceled desc = context canceled"}
[2023/04/11 11:17:42.979 +00:00] [WARN] [client/client.go:93] ["DataCoordClient, getSessions failed"] [key=datacoord] [error="context canceled"]
[2023/04/11 11:17:42.979 +00:00] [WARN] [client/client.go:89] ["QueryCoordClient GetSessions failed"] [error="context canceled"]
[2023/04/11 11:17:42.979 +00:00] [ERROR] [grpcclient/client.go:158] ["failed to get client address"] [error="context canceled"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).connect\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:158\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).GetGrpcClient\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:131\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).callOnce\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:256\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:327\ngithub.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502\ngithub.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57"]
[2023/04/11 11:17:42.979 +00:00] [ERROR] [grpcclient/client.go:158] ["failed to get client address"] [error="context canceled"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).connect\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:158\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).GetGrpcClient\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:131\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).callOnce\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:256\ngithub.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:327\ngithub.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).GetMetrics\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:359\ngithub.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func1\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:174\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57"]
[2023/04/11 11:17:42.979 +00:00] [ERROR] [grpcclient/client.go:330] ["ClientBase ReCall grpc second call get error"] [role=datacoord] [error="err: context canceled\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:329 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502 github.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195 github.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 golang.org/x/sync/errgroup.(*Group).Go.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:330\ngithub.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502\ngithub.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57"]
[2023/04/11 11:17:42.979 +00:00] [ERROR] [grpcclient/client.go:330] ["ClientBase ReCall grpc second call get error"] [role=querycoord] [error="err: context canceled\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:329 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:359 github.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).GetMetrics\n/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:174 github.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func1\n/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 golang.org/x/sync/errgroup.(*Group).Go.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"] [stack="github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n\t/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:330\ngithub.com/milvus-io/milvus/internal/distributed/querycoord/client.(*Client).GetMetrics\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/client/client.go:359\ngithub.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func1\n\t/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:174\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57"]
[2023/04/11 11:17:42.979 +00:00] [WARN] [rootcoord/quota_center.go:129] ["quotaCenter sync metrics failed"] [error="err: context canceled\n, /go/src/github.com/milvus-io/milvus/internal/util/trace/stack_trace.go:51 github.com/milvus-io/milvus/internal/util/trace.StackTrace\n/go/src/github.com/milvus-io/milvus/internal/util/grpcclient/client.go:329 github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n/go/src/github.com/milvus-io/milvus/internal/distributed/datacoord/client/client.go:502 github.com/milvus-io/milvus/internal/distributed/datacoord/client.(*Client).GetMetrics\n/go/src/github.com/milvus-io/milvus/internal/rootcoord/quota_center.go:195 github.com/milvus-io/milvus/internal/rootcoord.(*QuotaCenter).syncMetrics.func2\n/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 golang.org/x/sync/errgroup.(*Group).Go.func1\n/usr/local/go/src/runtime/asm_amd64.s:1571 runtime.goexit\n"]
[2023/04/11 11:17:42.979 +00:00] [INFO] [server/global_rmq.go:88] ["Close Rocksmq!"]
[2023/04/11 11:17:42.979 +00:00] [WARN] [server/rocksmq_retention.go:110] ["Rocksmq retention finish!"]
[2023/04/11 11:17:42.979 +00:00] [INFO] [server/rocksmq_impl.go:552] ["Rocksmq destroy consumer group successfully "] [topic=by-dev-rootcoord-dml_74] [group=by-dev-dataNode-6-by-dev-rootcoord-dml_74_440719517397319762v0] [elapsed=0]
[2023/04/11 11:17:42.979 +00:00] [INFO] [server/rocksmq_impl.go:552] ["Rocksmq destroy consumer group successfully "] [topic=by-dev-rootcoord-dml_63] [group=by-dev-dataNode-6-by-dev-rootcoord-dml_63_440719517397318900v1] [elapsed=0]
[2023/04/11 11:17:42.979 +00:00] [INFO] [server/rocksmq_impl.go:552] ["Rocksmq destroy consumer group successfully "] [topic=by-dev-rootcoord-dml_16] [group=by-dev-dataNode-6-by-dev-rootcoord-dml_16_440719517395914527v0] [elapsed=0]
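
For triage, a log like the one above can be tallied by level and call site to see which errors dominate around the shutdown. The following is a minimal sketch, assuming the bracketed `[timestamp] [LEVEL] [file:line] ["message"]` layout seen in these lines. The name `summarize` and the abbreviated sample lines are illustrative only and are not part of Milvus or its tooling.

```python
import re
from collections import Counter

# Matches the bracketed prefix seen in the log above:
# [timestamp] [LEVEL] [file:line] ["message"] ...
LINE_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<site>[^\]]+)\] \["(?P<msg>[^"]*)"\]'
)

def summarize(lines):
    """Count (level, call site, message) triples; skip lines in other formats."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # e.g. the JSON-formatted etcd-client lines
        counts[(m["level"], m["site"], m["msg"])] += 1
    return counts

# Abbreviated copies of lines from the log above (trailing fields trimmed).
sample = [
    '[2023/04/11 11:17:42.976 +00:00] [INFO] [rootcoord/root_coord.go:175] ["update rootcoord state"] [state=Abnormal]',
    '[2023/04/11 11:17:42.979 +00:00] [ERROR] [grpcclient/client.go:158] ["failed to get client address"] [error="context canceled"]',
    '[2023/04/11 11:17:42.979 +00:00] [ERROR] [grpcclient/client.go:158] ["failed to get client address"] [error="context canceled"]',
    '{"level":"warn","logger":"etcd-client","msg":"retrying of unary invoker failed"}',
]

# Print the most frequent entries first.
for (level, site, msg), n in summarize(sample).most_common():
    print(f"{n:3d}  {level:5s}  {site}  {msg}")
```

To run it on the attached server logs, pass an open file object instead of `sample`, since `summarize` accepts any iterable of lines.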

failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_for_release_cron/detail/deploy_test_for_release_cron/170/pipeline

log: artifacts-rocksmq-standalone-reinstall-170-server-logs (1).tar.gz

artifacts-rocksmq-standalone-reinstall-170-pytest-logs.tar.gz

Anything else?

No response

zhuwenxing, Apr 12 '23 03:04