[Bug]: Milvus deployment may fail because MinIO is not ready
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: 2.2.0-20230309-130ab6da
- Deployment mode (standalone or cluster): cluster
- MQ type (rocksmq, pulsar or kafka): pulsar
- SDK version (e.g. pymilvus v2.0.0rc2):
- OS (Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
Components of Milvus keep restarting.
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
image tag: 2.2.0-20230309-130ab6da
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/chaos-test-for-release-cron/detail/chaos-test-for-release-cron/2576/pipeline/
log: artifacts-querynode-pod-kill-2576-server-logs (1).tar.gz
Anything else?
No response
/assign @LoveEachDay
Please take a look.
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/chaos-test-for-release-cron/detail/chaos-test-for-release-cron/2857/pipeline
log:
artifacts-proxy-pod-failure-2857-server-logs (1).tar.gz
API: SYSTEM()
Time: 21:47:55 UTC 03/20/2023
Error: Marking http://proxy-pod-failure-2857-minio-3.proxy-pod-failure-2857-minio-svc.chaos-testing.svc.cluster.local:9000/minio/storage/export/v43 temporary offline; caused by Post "http://proxy-pod-failure-2857-minio-3.proxy-pod-failure-2857-minio-svc.chaos-testing.svc.cluster.local:9000/minio/storage/export/v43/readall?disk-id=&file-path=format.json&volume=.minio.sys": lookup proxy-pod-failure-2857-minio-3.proxy-pod-failure-2857-minio-svc.chaos-testing.svc.cluster.local on 10.101.0.10:53: no such host (*fmt.wrapError)
6: internal/rest/client.go:151:rest.(*Client).Call()
5: cmd/storage-rest-client.go:152:cmd.(*storageRESTClient).call()
4: cmd/storage-rest-client.go:520:cmd.(*storageRESTClient).ReadAll()
3: cmd/format-erasure.go:387:cmd.loadFormatErasure()
2: cmd/format-erasure.go:326:cmd.loadFormatErasureAll.func1()
1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
Waiting for all other servers to be online to format the disks (elapses 2m59s)
In case 2857, MinIO started successfully after 21:47:55 according to the logs, but datacoord had already exhausted its retry attempts at 21:46:43, which is why datacoord failed to start.
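To illustrate the failure mode, here is a minimal Go sketch of a bounded startup retry loop of the kind described above: the component polls MinIO's health endpoint and gives up once its retry budget is exhausted, even if MinIO becomes healthy shortly afterwards. The endpoint address, retry count, and interval below are hypothetical example values, not Milvus's actual datacoord configuration.

package main

import (
	"fmt"
	"net/http"
	"time"
)

// waitForMinio polls MinIO's liveness endpoint until it returns 200 OK or the
// retry budget is exhausted. maxRetries and interval are illustrative defaults.
func waitForMinio(endpoint string, maxRetries int, interval time.Duration) error {
	url := fmt.Sprintf("http://%s/minio/health/live", endpoint)
	for attempt := 1; attempt <= maxRetries; attempt++ {
		resp, err := http.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil // MinIO is up
			}
		}
		// Not ready yet; wait before the next attempt.
		time.Sleep(interval)
	}
	return fmt.Errorf("minio at %s not ready after %d attempts", endpoint, maxRetries)
}

func main() {
	// With a small budget like this, the caller gives up while MinIO is still
	// "Waiting for all other servers to be online to format the disks",
	// mirroring what happened to datacoord in case 2857.
	// "minio:9000" is a placeholder endpoint for the example.
	if err := waitForMinio("minio:9000", 10, 5*time.Second); err != nil {
		fmt.Println("startup failed:", err)
	}
}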
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.