[Bug]: QueryNode fails to start
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version: 2.2.9
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): Centos
- CPU/Memory: 6c32gb
- GPU:
- Others:
Current Behavior
I compiled Milvus 2.2.9 successfully in my environment. But when I deploy the cluster with my binary and libs, the QueryNode fails to start while the other components start fine. I got the following log from the QueryNodes:
`Welcome to use Milvus!
Version:
Built: Wed Jun 7 05:57:47 UTC 2023
GitCommit:
GoVersion: go version go1.18.8 linux/amd64
open pid file: /run/milvus/querynode.pid lock pid file: /run/milvus/querynode.pid [2023/06/08 01:46:31.748 +00:00] [INFO] [roles/roles.go:226] ["starting running Milvus components"] [2023/06/08 01:46:31.748 +00:00] [INFO] [roles/roles.go:152] ["Enable Jemalloc"] ["Jemalloc Path"=/milvus/lib/libjemalloc.so] [2023/06/08 01:46:31.748 +00:00] [INFO] [management/server.go:68] ["management listen"] [addr=:9091] [2023/06/08 01:46:31.769 +00:00] [INFO] [config/etcd_source.go:145] ["start refreshing configurations"] [2023/06/08 01:46:31.770 +00:00] [INFO] [paramtable/quota_param.go:745] ["init disk quota"] [diskQuota(MB)=+inf] [2023/06/08 01:46:31.770 +00:00] [INFO] [paramtable/quota_param.go:760] ["init disk quota per DB"] [diskQuotaPerCollection(MB)=1.7976931348623157e+308] [2023/06/08 01:46:31.770 +00:00] [INFO] [paramtable/component_param.go:1543] ["init segment max idle time"] [value=10m0s] [2023/06/08 01:46:31.770 +00:00] [INFO] [paramtable/component_param.go:1548] ["init segment min size from idle to sealed"] [value=16] [2023/06/08 01:46:31.770 +00:00] [INFO] [paramtable/component_param.go:1558] ["init segment max binlog file to sealed"] [value=32] [2023/06/08 01:46:31.770 +00:00] [INFO] [paramtable/component_param.go:1553] ["init segment expansion rate"] [value=1.25] [2023/06/08 01:46:31.775 +00:00] [INFO] [paramtable/base_table.go:142] ["cannot find etcd.endpoints"] [2023/06/08 01:46:31.775 +00:00] [INFO] [paramtable/hook_config.go:19] ["hook config"] [hook={}] [2023/06/08 01:46:31.776 +00:00] [ERROR] [querynode/query_node.go:188] ["load queryhook failed"] [error="fail to set the querynode plugin path"] [stack="github.com/milvus-io/milvus/internal/querynode.NewQueryNode\n\t/data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/querynode/query_node.go:188\ngithub.com/milvus-io/milvus/internal/distributed/querynode.NewServer\n\t/data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/distributed/querynode/service.go:83\ngithub.com/milvus-io/milvus/cmd/components.NewQueryNode\n\t/data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/components/query_node.go:40\ngithub.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n\t/data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/roles/roles.go:110"] [2023/06/08 01:46:31.797 +00:00] [INFO] [config/etcd_source.go:145] ["start refreshing configurations"] [2023/06/08 01:46:31.797 +00:00] [DEBUG] [paramtable/grpc_param.go:153] [initServerMaxSendSize] [role=querynode] [grpc.serverMaxSendSize=536870912] [2023/06/08 01:46:31.798 +00:00] [DEBUG] [paramtable/grpc_param.go:175] [initServerMaxRecvSize] [role=querynode] [grpc.serverMaxRecvSize=536870912] [2023/06/08 01:46:31.800 +00:00] [INFO] [querynode/service.go:106] [QueryNode] [port=21123] [2023/06/08 01:46:31.801 +00:00] [INFO] [querynode/service.go:122] ["QueryNode connect to etcd successfully"] [2023/06/08 01:46:31.902 +00:00] [INFO] [querynode/service.go:132] [QueryNode] [State=Initializing] [2023/06/08 01:46:31.902 +00:00] [INFO] [querynode/query_node.go:299] ["QueryNode session info"] [metaPath=by-dev/meta] [2023/06/08 01:46:31.902 +00:00] [INFO] [sessionutil/session_util.go:202] ["Session try to connect to etcd"] [2023/06/08 01:46:31.904 +00:00] [INFO] [sessionutil/session_util.go:217] ["Session connect to etcd success"] [2023/06/08 01:46:31.910 +00:00] [INFO] [sessionutil/session_util.go:300] ["Session get serverID success"] [key=id] [ServerId=411] [2023/06/08 01:46:31.929 +00:00] [INFO] [config/etcd_source.go:145] ["start refreshing configurations"] [2023/06/08 01:46:31.930 +00:00] 
[INFO] [paramtable/quota_param.go:745] ["init disk quota"] [diskQuota(MB)=+inf] [2023/06/08 01:46:31.930 +00:00] [INFO] [paramtable/quota_param.go:760] ["init disk quota per DB"] [diskQuotaPerCollection(MB)=1.7976931348623157e+308] [2023/06/08 01:46:31.930 +00:00] [INFO] [paramtable/component_param.go:1543] ["init segment max idle time"] [value=10m0s] [2023/06/08 01:46:31.930 +00:00] [INFO] [paramtable/component_param.go:1548] ["init segment min size from idle to sealed"] [value=16] [2023/06/08 01:46:31.930 +00:00] [INFO] [paramtable/component_param.go:1558] ["init segment max binlog file to sealed"] [value=32] [2023/06/08 01:46:31.930 +00:00] [INFO] [paramtable/component_param.go:1553] ["init segment expansion rate"] [value=1.25] [2023/06/08 01:46:31.934 +00:00] [INFO] [paramtable/base_table.go:142] ["cannot find etcd.endpoints"] [2023/06/08 01:46:31.934 +00:00] [INFO] [paramtable/hook_config.go:19] ["hook config"] [hook={}] {"level":"INFO","time":"2023/06/08 01:46:31.935 +00:00","caller":"logutil/logutil.go:165","message":"Log directory","configDir":""} {"level":"INFO","time":"2023/06/08 01:46:31.935 +00:00","caller":"logutil/logutil.go:166","message":"Set log file to ","path":""} {"level":"INFO","time":"2023/06/08 01:46:31.935 +00:00","caller":"querynode/query_node.go:209","message":"QueryNode init session","nodeID":411,"node address":"10.234.98.131:21123"} {"level":"INFO","time":"2023/06/08 01:46:31.935 +00:00","caller":"querynode/query_node.go:315","message":"QueryNode init rateCollector done","nodeID":411} {"level":"INFO","time":"2023/06/08 01:46:31.944 +00:00","caller":"storage/minio_chunk_manager.go:145","message":"minio chunk manager init success.","bucketname":"milvus-bucket","root":"file"} {"level":"INFO","time":"2023/06/08 01:46:31.944 +00:00","caller":"querynode/query_node.go:325","message":"queryNode try to connect etcd success","MetaRootPath":"by-dev/meta"} {"level":"INFO","time":"2023/06/08 01:46:31.944 +00:00","caller":"querynode/segment_loader.go:945","message":"SegmentLoader created","ioPoolSize":48,"cpuPoolSize":6} 2023-06-08 01:46:31,944 INFO [default] [KNOWHERE][SetBlasThreshold][milvus] Set faiss::distance_compute_blas_threshold to 16384 2023-06-08 01:46:31,945 INFO [default] [KNOWHERE][SetEarlyStopThreshold][milvus] Set faiss::early_stop_threshold to 0 2023-06-08 01:46:31,945 INFO [default] [KNOWHERE][SetStatisticsLevel][milvus] Set knowhere::STATISTICS_LEVEL to 0 2023-06-08 01:46:31,945 | DEBUG | default | [SERVER][operator()][milvus] Config easylogging with yaml file: /milvus/configs/easylogging.yaml 2023-06-08 01:46:31,946 | DEBUG | default | [SEGCORE][SegcoreSetSimdType][milvus] set config simd_type: auto 2023-06-08 01:46:31,946 | INFO | default | [KNOWHERE][SetSimdType][milvus] FAISS expect simdType::AUTO 2023-06-08 01:46:31,946 | INFO | default | [KNOWHERE][SetSimdType][milvus] FAISS hook AVX512 2023-06-08 01:46:31,946 | DEBUG | default | [SEGCORE][SetIndexSliceSize][milvus] set config index slice size(byte): 16777216 2023-06-08 01:46:31,946 | DEBUG | default | [SEGCORE][SetThreadCoreCoefficient][milvus] set thread pool core coefficient: 10 fatal error: unexpected signal during runtime execution [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x372ed75]
runtime stack: runtime.throw({0x40010ba?, 0x7f85846297d0?}) /opt/go/goroot/src/runtime/panic.go:992 +0x71 runtime.sigpanic() /opt/go/goroot/src/runtime/signal_unix.go:802 +0x389
goroutine 207 [syscall]: runtime.cgocall(0x3144350, 0xc0012072c0) /opt/go/goroot/src/runtime/cgocall.go:157 +0x5c fp=0xc001207258 sp=0xc001207220 pc=0x14788bc github.com/milvus-io/milvus/internal/util/initcore._Cfunc_InitRemoteChunkManagerSingleton({0x7f85770ffa20, 0x7f8577028910, 0x7f8577028940, 0x7f8577028930, 0x7f85770085b0, 0x7f85770085c8, 0x7f85770085d0, 0x0, 0x0, {0x0, ...}}) _cgo_gotypes.go:122 +0x5b fp=0xc0012072c0 sp=0xc001207258 pc=0x2b21b3b github.com/milvus-io/milvus/internal/util/initcore.InitRemoteChunkManager(0x5d39640) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/util/initcore/init_storage_config.go:71 +0x2e5 fp=0xc0012073f8 sp=0xc0012072c0 pc=0x2b220e5 github.com/milvus-io/milvus/internal/querynode.(*QueryNode).InitSegcore(0x4470038?) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/querynode/query_node.go:291 +0x23e fp=0xc001207478 sp=0xc0012073f8 pc=0x2de447e github.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init.func1() /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/querynode/query_node.go:346 +0x10e5 fp=0xc001207b50 sp=0xc001207478 pc=0x2de5865 sync.(*Once).doSlow(0x3f91fe6?, 0x14b8051?) /opt/go/goroot/src/sync/once.go:68 +0xc2 fp=0xc001207bb0 sp=0xc001207b50 pc=0x14ef022 sync.(*Once).Do(...) /opt/go/goroot/src/sync/once.go:59 github.com/milvus-io/milvus/internal/querynode.(*QueryNode).Init(0x3f91fe6?) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/querynode/query_node.go:297 +0x5b fp=0xc001207bf8 sp=0xc001207bb0 pc=0x2de473b github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init(0xc000a32420) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/distributed/querynode/service.go:133 +0x76e fp=0xc001207ee8 sp=0xc001207bf8 pc=0x2fb25ce github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).Run(0xc000a34301?) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/distributed/querynode/service.go:213 +0x25 fp=0xc001207f28 sp=0xc001207ee8 pc=0x2fb3925 github.com/milvus-io/milvus/cmd/components.(*QueryNode).Run(0x5d39640?) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/components/query_node.go:54 +0x1d fp=0xc001207f60 sp=0xc001207f28 pc=0x313015d github.com/milvus-io/milvus/cmd/roles.runComponent[...].func1() /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/roles/roles.go:120 +0x182 fp=0xc001207fe0 sp=0xc001207f60 pc=0x3132d82 runtime.goexit() /opt/go/goroot/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc001207fe8 sp=0xc001207fe0 pc=0x14e1fc1 created by github.com/milvus-io/milvus/cmd/roles.runComponent[...] /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/roles/roles.go:104 +0x18a
goroutine 1 [chan receive]: github.com/milvus-io/milvus/cmd/roles.(*MilvusRoles).Run(0xc0006c7e58, 0x0, {0x0, 0x0}) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/roles/roles.go:351 +0xb0d github.com/milvus-io/milvus/cmd/milvus.(*run).execute(0xc000a2c4b0, {0xc00004e090?, 0x3, 0x3}, 0xc000a32240) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/milvus/run.go:112 +0x66e github.com/milvus-io/milvus/cmd/milvus.RunMilvus({0xc00004e090?, 0x3, 0x3}) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/milvus/milvus.go:60 +0x21e main.main() /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/cmd/main.go:26 +0x2e
goroutine 220 [chan receive]: github.com/panjf2000/ants/v2.(*Pool).purgePeriodically(0xc0008bd420) /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:69 +0x8b created by github.com/panjf2000/ants/v2.NewPool /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:137 +0x34a
goroutine 230 [IO wait]: internal/poll.runtime_pollWait(0x7f854e859028, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc0010ad000?, 0xc000062500?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Accept(0xc0010ad000) /opt/go/goroot/src/internal/poll/fd_unix.go:614 +0x22c net.(*netFD).accept(0xc0010ad000) /opt/go/goroot/src/net/fd_unix.go:172 +0x35 net.(*TCPListener).accept(0xc000c02408) /opt/go/goroot/src/net/tcpsock_posix.go:139 +0x28 net.(*TCPListener).Accept(0xc000c02408) /opt/go/goroot/src/net/tcpsock.go:288 +0x3d net/http.(*Server).Serve(0xc0008960e0, {0x4445be0, 0xc000c02408}) /opt/go/goroot/src/net/http/server.go:3039 +0x385 net/http.(*Server).ListenAndServe(0xc0008960e0) /opt/go/goroot/src/net/http/server.go:2968 +0x7d net/http.ListenAndServe(...) /opt/go/goroot/src/net/http/server.go:3222 github.com/milvus-io/milvus/internal/management.ServeHTTP.func1() /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/management/server.go:69 +0x151 created by github.com/milvus-io/milvus/internal/management.ServeHTTP /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/management/server.go:66 +0x25
goroutine 231 [select]: google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc0004f5680) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:112 +0x73 created by google.golang.org/grpc.newCCBalancerWrapper /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:73 +0x22a
goroutine 311 [select]: google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0009bd9f0, 0x1) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:407 +0x115 google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000a33380) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:534 +0x85 google.golang.org/grpc/internal/transport.newHTTP2Client.func3() /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:415 +0x65 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:413 +0x1f91
goroutine 204 [IO wait]: internal/poll.runtime_pollWait(0x7f854e858f38, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc0005ec200?, 0xc0010ee000?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Read(0xc0005ec200, {0xc0010ee000, 0x8000, 0x8000}) /opt/go/goroot/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc0005ec200, {0xc0010ee000?, 0x3e081c0?, 0x1?}) /opt/go/goroot/src/net/fd_posix.go:55 +0x29 net.(*conn).Read(0xc000472668, {0xc0010ee000?, 0x14c5000?, 0x801010601?}) /opt/go/goroot/src/net/net.go:183 +0x45 bufio.(*Reader).Read(0xc000a32060, {0xc000896200, 0x9, 0x18?}) /opt/go/goroot/src/bufio/bufio.go:236 +0x1b4 io.ReadAtLeast({0x442a600, 0xc000a32060}, {0xc000896200, 0x9, 0x9}, 0x9) /opt/go/goroot/src/io/io.go:331 +0x9a io.ReadFull(...) /opt/go/goroot/src/io/io.go:350 golang.org/x/net/http2.readFrameHeader({0xc000896200?, 0x9?, 0x54b09a2?}, {0x442a600?, 0xc000a32060?}) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:237 +0x6e golang.org/x/net/http2.(*Framer).ReadFrame(0xc0008961c0) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:498 +0x95 google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000001e0) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:1498 +0x414 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:365 +0x193f
goroutine 234 [syscall]: os/signal.signal_recv() /opt/go/goroot/src/runtime/sigqueue.go:151 +0x2f os/signal.loop() /opt/go/goroot/src/os/signal/signal_unix.go:23 +0x19 created by os/signal.Notify.func1.1 /opt/go/goroot/src/os/signal/signal.go:151 +0x2a
goroutine 275 [select]: github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc00131ee80) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/config/etcd_source.go:147 +0x9f created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1 /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/config/etcd_source.go:98 +0x5a
goroutine 205 [select]: google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0009bc230, 0x1) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:407 +0x115 google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc0005f2780) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:534 +0x85 google.golang.org/grpc/internal/transport.newHTTP2Client.func3() /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:415 +0x65 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:413 +0x1f91
goroutine 206 [select]: github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc00091e180) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/config/etcd_source.go:147 +0x9f created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1 /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/config/etcd_source.go:98 +0x5a
goroutine 208 [select]: google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc000b0a740) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:112 +0x73 created by google.golang.org/grpc.newCCBalancerWrapper /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:73 +0x22a
goroutine 294 [IO wait]: internal/poll.runtime_pollWait(0x7f854e858a88, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc000bb8080?, 0xc000aac000?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Read(0xc000bb8080, {0xc000aac000, 0x8000, 0x8000}) /opt/go/goroot/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc000bb8080, {0xc000aac000?, 0x3e081c0?, 0x14efd01?}) /opt/go/goroot/src/net/fd_posix.go:55 +0x29 net.(*conn).Read(0xc000c0e000, {0xc000aac000?, 0x0?, 0x800010601?}) /opt/go/goroot/src/net/net.go:183 +0x45 bufio.(*Reader).Read(0xc0003ee240, {0xc0008963c0, 0x9, 0x18?}) /opt/go/goroot/src/bufio/bufio.go:236 +0x1b4 io.ReadAtLeast({0x442a600, 0xc0003ee240}, {0xc0008963c0, 0x9, 0x9}, 0x9) /opt/go/goroot/src/io/io.go:331 +0x9a io.ReadFull(...) /opt/go/goroot/src/io/io.go:350 golang.org/x/net/http2.readFrameHeader({0xc0008963c0?, 0x9?, 0xdac9e0d?}, {0x442a600?, 0xc0003ee240?}) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:237 +0x6e golang.org/x/net/http2.(*Framer).ReadFrame(0xc000896380) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:498 +0x95 google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000005a0) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:1498 +0x414 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:365 +0x193f
goroutine 292 [select]: google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0009bd270, 0x1) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:407 +0x115 google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000a32e40) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:534 +0x85 google.golang.org/grpc/internal/transport.newHTTP2Client.func3() /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:415 +0x65 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:413 +0x1f91
goroutine 291 [IO wait]: internal/poll.runtime_pollWait(0x7f854e858d58, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc00056e180?, 0xc0013be000?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Read(0xc00056e180, {0xc0013be000, 0x8000, 0x8000}) /opt/go/goroot/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc00056e180, {0xc0013be000?, 0x3e081c0?, 0x1?}) /opt/go/goroot/src/net/fd_posix.go:55 +0x29 net.(*conn).Read(0xc0001b7fa8, {0xc0013be000?, 0x14c5000?, 0x800010601?}) /opt/go/goroot/src/net/net.go:183 +0x45 bufio.(*Reader).Read(0xc000a32de0, {0xc0008962e0, 0x9, 0x18?}) /opt/go/goroot/src/bufio/bufio.go:236 +0x1b4 io.ReadAtLeast({0x442a600, 0xc000a32de0}, {0xc0008962e0, 0x9, 0x9}, 0x9) /opt/go/goroot/src/io/io.go:331 +0x9a io.ReadFull(...) /opt/go/goroot/src/io/io.go:350 golang.org/x/net/http2.readFrameHeader({0xc0008962e0?, 0x9?, 0x6f1cae7?}, {0x442a600?, 0xc000a32de0?}) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:237 +0x6e golang.org/x/net/http2.(*Framer).ReadFrame(0xc0008962a0) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:498 +0x95 google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0000003c0) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:1498 +0x414 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:365 +0x193f
goroutine 276 [select]: github.com/uber/jaeger-client-go.(*RemotelyControlledSampler).pollControllerWithTicker(0xc0002e4d00, 0xc0009bd540) /opt/go/gopath/pkg/mod/github.com/uber/[email protected]+incompatible/sampler_remote.go:144 +0x89 github.com/uber/jaeger-client-go.(*RemotelyControlledSampler).pollController(0xc0002e4d00) /opt/go/gopath/pkg/mod/github.com/uber/[email protected]+incompatible/sampler_remote.go:139 +0x6d created by github.com/uber/jaeger-client-go.NewRemotelyControlledSampler /opt/go/gopath/pkg/mod/github.com/uber/[email protected]+incompatible/sampler_remote.go:86 +0x15b
goroutine 279 [select]: github.com/uber/jaeger-client-go/utils.(*reconnectingUDPConn).reconnectLoop(0xc000868070, 0x0?) /opt/go/gopath/pkg/mod/github.com/uber/[email protected]+incompatible/utils/reconnecting_udp_conn.go:70 +0xbc created by github.com/uber/jaeger-client-go/utils.newReconnectingUDPConn /opt/go/gopath/pkg/mod/github.com/uber/[email protected]+incompatible/utils/reconnecting_udp_conn.go:60 +0x205
goroutine 280 [select]: github.com/uber/jaeger-client-go.(*remoteReporter).processQueue(0xc000b8f1a0) /opt/go/gopath/pkg/mod/github.com/uber/[email protected]+incompatible/reporter.go:296 +0xde created by github.com/uber/jaeger-client-go.NewRemoteReporter /opt/go/gopath/pkg/mod/github.com/uber/[email protected]+incompatible/reporter.go:237 +0x245
goroutine 281 [select]: google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc00052dd80) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:112 +0x73 created by google.golang.org/grpc.newCCBalancerWrapper /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:73 +0x22a
goroutine 284 [IO wait]: internal/poll.runtime_pollWait(0x7f854e858c68, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc0005eda80?, 0xc000614c00?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Accept(0xc0005eda80) /opt/go/goroot/src/internal/poll/fd_unix.go:614 +0x22c net.(*netFD).accept(0xc0005eda80) /opt/go/goroot/src/net/fd_unix.go:172 +0x35 net.(*TCPListener).accept(0xc0001ca3c0) /opt/go/goroot/src/net/tcpsock_posix.go:139 +0x28 net.(*TCPListener).Accept(0xc0001ca3c0) /opt/go/goroot/src/net/tcpsock.go:288 +0x3d google.golang.org/grpc.(*Server).Serve(0xc0008c9880, {0x4445be0?, 0xc0001ca3c0}) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/server.go:780 +0x477 github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).startGrpcLoop(0xc000a32420, 0x5283) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/distributed/querynode/service.go:203 +0x8ff created by github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).init /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/distributed/querynode/service.go:124 +0x5dd
goroutine 307 [select]: google.golang.org/grpc.(*ccBalancerWrapper).watcher(0xc00091b8c0) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:112 +0x73 created by google.golang.org/grpc.newCCBalancerWrapper /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/balancer_conn_wrappers.go:73 +0x22a
goroutine 295 [select]: google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0010b2050, 0x1) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:407 +0x115 google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc0003ee4e0) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:534 +0x85 google.golang.org/grpc/internal/transport.newHTTP2Client.func3() /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:415 +0x65 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:413 +0x1f91
goroutine 310 [IO wait]: internal/poll.runtime_pollWait(0x7f854e858998, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc00088c180?, 0xc000976000?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Read(0xc00088c180, {0xc000976000, 0x8000, 0x8000}) /opt/go/goroot/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc00088c180, {0xc000976000?, 0x3e081c0?, 0x1?}) /opt/go/goroot/src/net/fd_posix.go:55 +0x29 net.(*conn).Read(0xc000ba74d8, {0xc000976000?, 0x14c5000?, 0x800010601?}) /opt/go/goroot/src/net/net.go:183 +0x45 bufio.(*Reader).Read(0xc000a33320, {0xc0004fe3c0, 0x9, 0x18?}) /opt/go/goroot/src/bufio/bufio.go:236 +0x1b4 io.ReadAtLeast({0x442a600, 0xc000a33320}, {0xc0004fe3c0, 0x9, 0x9}, 0x9) /opt/go/goroot/src/io/io.go:331 +0x9a io.ReadFull(...) /opt/go/goroot/src/io/io.go:350 golang.org/x/net/http2.readFrameHeader({0xc0004fe3c0?, 0x9?, 0xeda36b9?}, {0x442a600?, 0xc000a33320?}) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:237 +0x6e golang.org/x/net/http2.(*Framer).ReadFrame(0xc0004fe380) /opt/go/gopath/pkg/mod/golang.org/x/[email protected]/http2/frame.go:498 +0x95 google.golang.org/grpc/internal/transport.(*http2Client).reader(0xc0005d9a40) /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:1498 +0x414 created by google.golang.org/grpc/internal/transport.newHTTP2Client /opt/go/gopath/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_client.go:365 +0x193f
goroutine 296 [select]: github.com/milvus-io/milvus/internal/config.(*EtcdSource).refreshConfigurationsPeriodically(0xc0006f1580) /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/config/etcd_source.go:147 +0x9f created by github.com/milvus-io/milvus/internal/config.(*EtcdSource).GetConfigurations.func1 /data/jianwang25/roadmap/milvus-2.2.9/milvus-2.2.9/internal/config/etcd_source.go:98 +0x5a
goroutine 304 [chan receive]: github.com/panjf2000/ants/v2.(*Pool).purgePeriodically(0xc0008683f0) /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:69 +0x8b created by github.com/panjf2000/ants/v2.NewPool /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:137 +0x34a
goroutine 302 [IO wait]: internal/poll.runtime_pollWait(0x7f854e8588a8, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc0006f1a00?, 0xc00107f000?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Read(0xc0006f1a00, {0xc00107f000, 0x1000, 0x1000}) /opt/go/goroot/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc0006f1a00, {0xc00107f000?, 0x0?, 0x4?}) /opt/go/goroot/src/net/fd_posix.go:55 +0x29 net.(*conn).Read(0xc000662070, {0xc00107f000?, 0x0?, 0x0?}) /opt/go/goroot/src/net/net.go:183 +0x45 net/http.(*persistConn).Read(0xc001068360, {0xc00107f000?, 0x14c0b80?, 0xc000371ec8?}) /opt/go/goroot/src/net/http/transport.go:1929 +0x4e bufio.(*Reader).fill(0xc0003ef7a0) /opt/go/goroot/src/bufio/bufio.go:106 +0x103 bufio.(*Reader).Peek(0xc0003ef7a0, 0x1) /opt/go/goroot/src/bufio/bufio.go:144 +0x5d net/http.(*persistConn).readLoop(0xc001068360) /opt/go/goroot/src/net/http/transport.go:2093 +0x1ac created by net/http.(*Transport).dialConn /opt/go/goroot/src/net/http/transport.go:1750 +0x173e
goroutine 303 [select]: net/http.(*persistConn).writeLoop(0xc001068360) /opt/go/goroot/src/net/http/transport.go:2392 +0xf5 created by net/http.(*Transport).dialConn /opt/go/goroot/src/net/http/transport.go:1751 +0x1791
goroutine 305 [chan receive]: github.com/panjf2000/ants/v2.(*Pool).purgePeriodically(0xc000868460) /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:69 +0x8b created by github.com/panjf2000/ants/v2.NewPool /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:137 +0x34a
goroutine 322 [chan receive]: github.com/panjf2000/ants/v2.(*Pool).purgePeriodically(0xc0008684d0) /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:69 +0x8b created by github.com/panjf2000/ants/v2.NewPool /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:137 +0x34a
goroutine 323 [chan receive]: github.com/panjf2000/ants/v2.(*Pool).purgePeriodically(0xc000868540) /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:69 +0x8b created by github.com/panjf2000/ants/v2.NewPool /opt/go/gopath/pkg/mod/github.com/panjf2000/ants/[email protected]/pool.go:137 +0x34a
goroutine 324 [IO wait]: internal/poll.runtime_pollWait(0x7f854e858b78, 0x72) /opt/go/goroot/src/runtime/netpoll.go:302 +0x89 internal/poll.(*pollDesc).wait(0xc00091f300?, 0xc00121a000?, 0x0) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:83 +0x32 internal/poll.(*pollDesc).waitRead(...) /opt/go/goroot/src/internal/poll/fd_poll_runtime.go:88 internal/poll.(*FD).Read(0xc00091f300, {0xc00121a000, 0x1000, 0x1000}) /opt/go/goroot/src/internal/poll/fd_unix.go:167 +0x25a net.(*netFD).Read(0xc00091f300, {0xc00121a000?, 0xc00150a6e0?, 0x14ef4de?}) /opt/go/goroot/src/net/fd_posix.go:55 +0x29 net.(*conn).Read(0xc000662800, {0xc00121a000?, 0x7f85487294c0?, 0x1483220?}) /opt/go/goroot/src/net/net.go:183 +0x45 net/http.(*connReader).Read(0xc001057c80, {0xc00121a000, 0x1000, 0x1000}) /opt/go/goroot/src/net/http/server.go:780 +0x16d bufio.(*Reader).fill(0xc0003efe00) /opt/go/goroot/src/bufio/bufio.go:106 +0x103 bufio.(*Reader).ReadSlice(0xc0003efe00, 0x0?) /opt/go/goroot/src/bufio/bufio.go:371 +0x2f bufio.(*Reader).ReadLine(0xc0003efe00) /opt/go/goroot/src/bufio/bufio.go:400 +0x27 net/textproto.(*Reader).readLineSlice(0xc0009171d0) /opt/go/goroot/src/net/textproto/reader.go:57 +0x99 net/textproto.(*Reader).ReadLine(...) /opt/go/goroot/src/net/textproto/reader.go:38 net/http.readRequest(0xc000662800?) /opt/go/goroot/src/net/http/request.go:1029 +0x79 net/http.(*conn).readRequest(0xc0010c4d20, {0x4447b10, 0xc0004f5cc0}) /opt/go/goroot/src/net/http/server.go:988 +0x24a net/http.(*conn).serve(0xc0010c4d20, {0x4447bb8, 0xc0010be8a0}) /opt/go/goroot/src/net/http/server.go:1891 +0x32b created by net/http.(*Server).Serve /opt/go/goroot/src/net/http/server.go:3071 +0x4db ` This seems to be a minio connecting issue.
All components' statuses:
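Since the stack trace ends inside `InitRemoteChunkManagerSingleton`, one quick way to rule out plain network problems is to check that the configured MinIO endpoint is reachable from inside the querynode pod. Below is a minimal Go sketch for that check; it is not part of Milvus, and the endpoint value is a placeholder that should be replaced with the `minio.address`/`minio.port` from `milvus.yaml` (or the MinIO Kubernetes service name):

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

func main() {
	// Placeholder endpoint: replace with the minio.address:minio.port
	// configured in milvus.yaml, or the MinIO service name in the cluster.
	endpoint := "my-release-minio:9000"

	conn, err := net.DialTimeout("tcp", endpoint, 3*time.Second)
	if err != nil {
		fmt.Fprintf(os.Stderr, "cannot reach %s: %v\n", endpoint, err)
		os.Exit(1)
	}
	defer conn.Close()
	fmt.Printf("%s is reachable over TCP\n", endpoint)
}
```

If this succeeds but the querynode still segfaults at the same place, the problem is more likely in the storage configuration or in the self-compiled segcore libraries than in basic connectivity.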
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response
This is my build environment:
@dzqoo It looks like something went wrong when initializing the RemoteChunkManager.
Could you please provide the configuration for the storage part (with sensitive fields masked)?
These are my MinIO configs:

```yaml
existingSecret: ""
bucketName: "milvus-bucket"
rootPath: file
useIAM: false
iamEndpoint: ""
podDisruptionBudget:
  enabled: false
resources:
  requests:
    memory: 4Gi
    cpu: 1
gcsgateway:
  enabled: false
  replicas: 1
  gcsKeyJson: "/etc/credentials/gcs_key.json"
  projectId: ""
service:
  type: NodePort
  port: 9000
  nodePort: 31900
persistence:
  enabled: true
  existingClaim: ""
  storageClass:
  accessMode: ReadWriteOnce
  size: 500Gi
livenessProbe:
  enabled: true
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 5
readinessProbe:
  enabled: true
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 1
  successThreshold: 1
  failureThreshold: 5
startupProbe:
  enabled: true
  initialDelaySeconds: 0
  periodSeconds: 10
  timeoutSeconds: 5
  successThreshold: 1
  failureThreshold: 60
```
With the same configs, the official Docker images started fine.
Could you log in to any Milvus pod and paste the `milvus.yaml` under the `/milvus/configs` directory?
> Could you log in to any Milvus pod and paste the `milvus.yaml` under the `/milvus/configs` directory?
etcd:
endpoints:
- localhost:2379
rootPath: by-dev # The root path where data is stored in etcd
metaSubPath: meta # metaRootPath = rootPath + '/' + metaSubPath
kvSubPath: kv # kvRootPath = rootPath + '/' + kvSubPath
log:
# path is one of:
# - "default" as os.Stderr,
# - "stderr" as os.Stderr,
# - "stdout" as os.Stdout,
# - file path to append server logs to.
# please adjust in embedded Milvus: /tmp/milvus/logs/etcd.log
path: stdout
level: info # Only supports debug, info, warn, error, panic, or fatal. Default 'info'.
use:
# please adjust in embedded Milvus: true
embed: false # Whether to enable embedded Etcd (an in-process EtcdServer).
data:
# Embedded Etcd only.
# please adjust in embedded Milvus: /tmp/milvus/etcdData/
dir: default.etcd
ssl:
enabled: false # Whether to support ETCD secure connection mode
tlsCert: /path/to/etcd-client.pem # path to your cert file
tlsKey: /path/to/etcd-client-key.pem # path to your key file
tlsCACert: /path/to/ca.pem # path to your CACert file
# TLS min version
# Optional values: 1.0, 1.1, 1.2, 1.3.
# We recommend using version 1.2 and above
tlsMinVersion: 1.3
# Default value: etcd
# Valid values: [etcd, mysql]
metastore:
type: etcd
# Related configuration of mysql, used to store Milvus metadata.
mysql:
username: root
password: 123456
address: localhost
port: 3306
dbName: milvus_meta
driverName: mysql
maxOpenConns: 20
maxIdleConns: 5
# please adjust in embedded Milvus: /tmp/milvus/data/
localStorage:
path: /var/lib/milvus/data/
# Related configuration of MinIO/S3/GCS or any other service that supports the S3 API, which is responsible for data persistence for Milvus.
# We refer to the storage service as MinIO/S3 in the following description for simplicity.
minio:
address: localhost # Address of MinIO/S3
port: 9000 # Port of MinIO/S3
accessKeyID: minioadmin # accessKeyID of MinIO/S3
secretAccessKey: minioadmin # MinIO/S3 encryption string
useSSL: false # Access to MinIO/S3 with SSL
bucketName: "a-bucket" # Bucket name in MinIO/S3
rootPath: files # The root path where the message is stored in MinIO/S3
# Whether to use IAM role to access S3/GCS instead of access/secret keys
# For more information, refer to
# aws: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html
# gcp: https://cloud.google.com/storage/docs/access-control/iam
# aliyun (ack): https://www.alibabacloud.com/help/en/container-service-for-kubernetes/latest/use-rrsa-to-enforce-access-control
# aliyun (ecs): https://www.alibabacloud.com/help/en/elastic-compute-service/latest/attach-an-instance-ram-role
useIAM: false
# Cloud Provider of S3. Supports: "aws", "gcp", "aliyun".
# You can use "aws" for other cloud providers that support the S3 API with signature v4, e.g. MinIO
# You can use "gcp" for other cloud providers that support the S3 API with signature v2
# You can use "aliyun" for other cloud providers that use virtual-host-style buckets
# When useIAM is enabled, only "aws", "gcp", and "aliyun" are supported for now
cloudProvider: aws
# Custom endpoint for fetch IAM role credentials. when useIAM is true & cloudProvider is "aws".
# Leave it empty if you want to use AWS default endpoint
iamEndpoint: ""
# Milvus supports three MQs: rocksmq (based on RocksDB), Pulsar, and Kafka. Keep only the one you use in this config.
# If multiple MQs are configured in this file, the enabling priority is:
# 1. standalone (local) mode: rocksmq (default) > Pulsar > Kafka
# 2. cluster mode: Pulsar (default) > Kafka (rocksmq is unsupported)
# Related configuration of pulsar, used to manage Milvus logs of recent mutation operations, output streaming log, and provide log publish-subscribe services.
pulsar:
address: localhost # Address of pulsar
port: 6650 # Port of pulsar
webport: 80 # Web port of pulsar, if you connect directly without proxy, should use 8080
maxMessageSize: 5242880 # 5 * 1024 * 1024 Bytes, Maximum size of each message in pulsar.
tenant: public
namespace: default
# If you want to enable kafka, needs to comment the pulsar configs
kafka:
producer:
client.id: dc
consumer:
client.id: dc1
# brokerList: localhost1:9092,localhost2:9092,localhost3:9092
# saslUsername: username
# saslPassword: password
# saslMechanisms: PLAIN
# securityProtocol: SASL_SSL
rocksmq:
# please adjust in embedded Milvus: /tmp/milvus/rdb_data
path: /var/lib/milvus/rdb_data # The path where the message is stored in rocksmq
rocksmqPageSize: 67108864 # 64 MB, 64 * 1024 * 1024 bytes, The size of each page of messages in rocksmq
retentionTimeInMinutes: 4320 # 3 days, 3 * 24 * 60 minutes, The retention time of the message in rocksmq.
retentionSizeInMB: 8192 # 8 GB, 8 * 1024 MB, The retention size of the message in rocksmq.
compactionInterval: 86400 # 1 day, trigger rocksdb compaction every day to remove deleted data
lrucacheratio: 0.06 # rocksdb cache memory ratio
# Related configuration of rootCoord, used to handle data definition language (DDL) and data control language (DCL) requests
rootCoord:
address: localhost
port: 53100
enableActiveStandby: false # Enable active-standby
dmlChannelNum: 16 # The number of dml channels created at system startup
maxDatabaseNum: 64 # Maximum number of database
maxPartitionNum: 4096 # Maximum number of partitions in a collection
minSegmentSizeToEnableIndex: 1024 # It's a threshold. When the segment size is less than this value, the segment will not be indexed
# (in seconds) Duration after which an import task will expire (be killed). Default 900 seconds (15 minutes).
# Note: If default value is to be changed, change also the default in: internal/util/paramtable/component_param.go
importTaskExpiration: 900
# (in seconds) Milvus will keep the record of import tasks for at least `importTaskRetention` seconds. Default 86400
# seconds (24 hours).
# Note: If default value is to be changed, change also the default in: internal/util/paramtable/component_param.go
importTaskRetention: 86400
# Related configuration of proxy, used to validate client requests and reduce the returned results.
proxy:
port: 19530
internalPort: 19529
http:
enabled: true # Whether to enable the http server
debug_mode: false # Whether to enable http server debug mode
timeTickInterval: 200 # ms, the interval that proxy synchronize the time tick
msgStream:
timeTick:
bufSize: 512
maxNameLength: 255 # Maximum length of name for a collection or alias
maxFieldNum: 64 # Maximum number of fields in a collection.
# As of today (2.2.0 and after) it is strongly DISCOURAGED to set maxFieldNum >= 64.
# So adjust at your risk!
maxDimension: 32768 # Maximum dimension of a vector
# It's strongly DISCOURAGED to set `maxShardNum` > 64.
maxShardNum: 16 # Maximum number of shards in a collection
maxTaskNum: 1024 # max task number of proxy task queue
# please adjust in embedded Milvus: false
ginLogging: true # Whether to produce gin logs.
grpc:
serverMaxRecvSize: 67108864 # 64M
serverMaxSendSize: 67108864 # 64M
clientMaxRecvSize: 104857600 # 100 MB, 100 * 1024 * 1024
clientMaxSendSize: 104857600 # 100 MB, 100 * 1024 * 1024
# Related configuration of queryCoord, used to manage topology and load balancing for the query nodes, and handoff from growing segments to sealed segments.
queryCoord:
address: localhost
port: 19531
autoHandoff: true # Enable auto handoff
autoBalance: true # Enable auto balance
balancer: ScoreBasedBalancer # Balancer to use
globalRowCountFactor: 0.1 # expert parameters, only used by scoreBasedBalancer
scoreUnbalanceTolerationFactor: 0.05 # expert parameters, only used by scoreBasedBalancer
reverseUnBalanceTolerationFactor: 1.3 #expert parameters, only used by scoreBasedBalancer
overloadedMemoryThresholdPercentage: 90 # The threshold percentage that memory overload
balanceIntervalSeconds: 60
memoryUsageMaxDifferencePercentage: 30
checkInterval: 10000
channelTaskTimeout: 60000 # 1 minute
segmentTaskTimeout: 120000 # 2 minute
distPullInterval: 500
loadTimeoutSeconds: 1800
checkHandoffInterval: 5000
taskMergeCap: 8
taskExecutionCap: 256
enableActiveStandby: false # Enable active-standby
refreshTargetsIntervalSeconds: 300
# Related configuration of queryNode, used to run hybrid search between vector and scalar data.
queryNode:
cacheSize: 32 # GB, default 32 GB, `cacheSize` is the memory used for caching data for faster query. The `cacheSize` must be less than system memory size.
port: 21123
loadMemoryUsageFactor: 3 # The multiply factor of calculating the memory usage while loading segments
enableDisk: true # enable querynode load disk index, and search on disk index
maxDiskUsagePercentage: 95
gracefulStopTimeout: 30
stats:
publishInterval: 1000 # Interval for querynode to report node information (milliseconds)
dataSync:
flowGraph:
maxQueueLength: 1024 # Maximum length of task queue in flowgraph
maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
# Segcore will divide a segment into multiple chunks to enable small index
segcore:
chunkRows: 1024 # The number of vectors in a chunk.
knowhereThreadPoolNumRatio: 4 # Use more threads to make good use of SSD throughput
# Note: we have disabled segment small index since @2022.05.12. So below related configurations won't work.
# We won't create small index for growing segments and search on these segments will directly use bruteforce scan.
smallIndex:
nlist: 128 # small index nlist, recommend to set sqrt(chunkRows), must smaller than chunkRows/8
nprobe: 16 # nprobe to search small index, based on your accuracy requirement, must smaller than nlist
cache:
enabled: true
memoryLimit: 2147483648 # 2 GB, 2 * 1024 *1024 *1024
scheduler:
receiveChanSize: 10240
unsolvedQueueSize: 10240
# maxReadConcurrentRatio is the concurrency ratio of read task (search task and query task).
# Max read concurrency would be the value of `runtime.NumCPU * maxReadConcurrentRatio`.
# It defaults to 2.0, which means max read concurrency would be the value of runtime.NumCPU * 2.
# Max read concurrency must greater than or equal to 1, and less than or equal to runtime.NumCPU * 100.
maxReadConcurrentRatio: 2.0 # (0, 100]
cpuRatio: 10.0 # ratio used to estimate read task cpu usage.
# maxTimestampLag is the max ts lag between serviceable and guarantee timestamp.
# if the lag is larger than this config, scheduler will return error without waiting.
# the valid value is [3600, infinite)
maxTimestampLag: 86400
# read task schedule policy: fifo(by default), user-task-polling.
scheduleReadPolicy:
# fifo: A FIFO queue support the schedule.
# user-task-polling:
# The user's tasks will be polled one by one and scheduled.
# Scheduling is fair on task granularity.
# The policy is based on the username for authentication.
# And an empty username is considered the same user.
# When there are no multi-users, the policy decay into FIFO
name: fifo
# user-task-polling configure:
taskQueueExpire: 60 # 1 min by default, expire time of inner user task queue since queue is empty.
grouping:
enabled: true
maxNQ: 50000
topKMergeRatio: 10.0
indexCoord:
address: localhost
port: 31000
enableActiveStandby: false # Enable active-standby
minSegmentNumRowsToEnableIndex: 1024 # It's a threshold. When the segment num rows is less than this value, the segment will not be indexed
bindIndexNodeMode:
enable: false
address: "localhost:22930"
withCred: false
nodeID: 0
gc:
interval: 600 # gc interval in seconds
scheduler:
interval: 1000 # scheduler interval in Millisecond
indexNode:
port: 21121
enableDisk: true # enable index node build disk vector index
maxDiskUsagePercentage: 95
gracefulStopTimeout: 30
scheduler:
buildParallel: 1
dataCoord:
address: localhost
port: 13333
enableCompaction: true # Enable data segment compaction
enableGarbageCollection: true
enableActiveStandby: false # Enable active-standby
channel:
watchTimeoutInterval: 120 # Timeout on watching channels (in seconds). Datanode tickler update watch progress will reset timeout timer.
balanceSilentDuration: 300 # The duration before the channelBalancer on datacoord to run
balanceInterval: 360 #The interval for the channelBalancer on datacoord to check balance status
segment:
maxSize: 512 # Maximum size of a segment in MB
diskSegmentMaxSize: 2048 # Maximum size of a segment in MB for collection which has Disk index
# Minimum proportion for a segment which can be sealed.
# Sealing early can prevent producing large growing segments in case these segments might slow down our search/query.
# Segments that sealed early will be compacted into a larger segment (within maxSize) eventually.
sealProportion: 0.23
assignmentExpiration: 2000 # The time of the assignment expiration in ms
maxLife: 86400 # The max lifetime of segment in seconds, 24*60*60
# If a segment didn't accept dml records in `maxIdleTime` and the size of segment is greater than
# `minSizeFromIdleToSealed`, Milvus will automatically seal it.
maxIdleTime: 600 # The max idle time of segment in seconds, 10*60.
minSizeFromIdleToSealed: 16 # The min size in MB for an idle segment to be sealed.
# The max number of binlog file for one segment, the segment will be sealed if
# the number of binlog file reaches to max value.
maxBinlogFileNumber: 32
smallProportion: 0.5 # The segment is considered as "small segment" when its # of rows is smaller than
# (smallProportion * segment max # of rows).
compactableProportion: 0.85 # A compaction will happen on small segments if the segment after compaction will have
# over (compactableProportion * segment max # of rows) rows.
# MUST BE GREATER THAN OR EQUAL TO <smallProportion>!!!
expansionRate: 1.25 # During compaction, the size of segment # of rows is able to exceed segment max # of rows by (expansionRate-1) * 100%.
compaction:
enableAutoCompaction: true
gc:
interval: 3600 # gc interval in seconds
missingTolerance: 86400 # file meta missing tolerance duration in seconds, 60*24
dropTolerance: 3600 # file belongs to dropped entity tolerance duration in seconds
dataNode:
port: 21124
dataSync:
flowGraph:
maxQueueLength: 1024 # Maximum length of task queue in flowgraph
maxParallelism: 1024 # Maximum number of tasks executed in parallel in the flowgraph
segment:
# Max buffer size to flush for a single segment.
insertBufSize: 16777216 # Bytes, 16 MB
# Max buffer size to flush del for a single channel
deleteBufBytes: 67108864 # Bytes, 64MB
# The period to sync segments if buffer is not empty.
syncPeriod: 600 # Seconds, 10min
memory:
forceSyncEnable: true # `true` to force sync if memory usage is too high
forceSyncSegmentNum: 1 # number of segments to sync, segments with top largest buffer will be synced.
watermarkStandalone: 0.2 # memory watermark for standalone, upon reaching this watermark, segments will be synced.
watermarkCluster: 0.5 # memory watermark for cluster, upon reaching this watermark, segments will be synced.
# Configures the system log output.
log:
level: debug # Only supports debug, info, warn, error, panic, or fatal. Default 'info'.
stdout: "true" # default true, print log to stdout
file:
# please adjust in embedded Milvus: /tmp/milvus/logs
rootPath: "" # root dir path to put logs, default "" means no log file will print
maxSize: 300 # MB
maxAge: 10 # Maximum time for log retention in day.
maxBackups: 20
format: text # text/json
grpc:
log:
level: WARNING
serverMaxRecvSize: 536870912 # 512MB
serverMaxSendSize: 536870912 # 512MB
clientMaxRecvSize: 104857600 # 100 MB, 100 * 1024 * 1024
clientMaxSendSize: 104857600 # 100 MB, 100 * 1024 * 1024
client:
dialTimeout: 200
keepAliveTime: 10000
keepAliveTimeout: 20000
maxMaxAttempts: 5
initialBackOff: 1.0
maxBackoff: 60.0
backoffMultiplier: 2.0
server:
retryTimes: 5 # retry times when receiving a grpc return value with a failure and retryable state code
# Configure the proxy tls enable.
tls:
serverPemPath: configs/cert/server.pem
serverKeyPath: configs/cert/server.key
caPemPath: configs/cert/ca.pem
common:
# Channel name generation rule: ${namePrefix}-${ChannelIdx}
chanNamePrefix:
cluster: "by-dev"
rootCoordTimeTick: "rootcoord-timetick"
rootCoordStatistics: "rootcoord-statistics"
rootCoordDml: "rootcoord-dml"
rootCoordDelta: "rootcoord-delta"
search: "search"
searchResult: "searchResult"
queryTimeTick: "queryTimeTick"
queryNodeStats: "query-node-stats"
# Cmd for loadIndex, flush, etc...
cmd: "cmd"
dataCoordStatistic: "datacoord-statistics-channel"
dataCoordTimeTick: "datacoord-timetick-channel"
dataCoordSegmentInfo: "segment-info-channel"
# Sub name generation rule: ${subNamePrefix}-${NodeID}
subNamePrefix:
rootCoordSubNamePrefix: "rootCoord"
proxySubNamePrefix: "proxy"
queryNodeSubNamePrefix: "queryNode"
dataNodeSubNamePrefix: "dataNode"
dataCoordSubNamePrefix: "dataCoord"
defaultPartitionName: "_default" # default partition name for a collection
defaultIndexName: "_default_idx" # default index name
retentionDuration: 0 # time travel reserved time, insert/delete will not be cleaned in this period. disable it by default
entityExpiration: -1 # Entity expiration in seconds, CAUTION make sure entityExpiration >= retentionDuration and -1 means never expire
gracefulTime: 5000 # milliseconds. it represents the interval (in ms) by which the request arrival time needs to be subtracted in the case of Bounded Consistency.
gracefulStopTimeout: 30 # seconds. it will force quit the server if the graceful stop process is not completed during this time.
# Default value: auto
# Valid values: [auto, avx512, avx2, avx, sse4_2]
# This configuration is only used by querynode and indexnode, it selects CPU instruction set for Searching and Index-building.
simdType: auto
indexSliceSize: 16 # MB
DiskIndex:
MaxDegree: 56
SearchListSize: 100
PQCodeBudgetGBRatio: 0.125
BuildNumThreadsRatio: 1.0
SearchCacheBudgetGBRatio: 0.10
LoadNumThreadRatio: 8.0
BeamWidthRatio: 4.0
# This parameter specify how many times the number of threads is the number of cores
threadCoreCoefficient : 10
# please adjust in embedded Milvus: local
storageType: minio
security:
authorizationEnabled: false
# The superusers will ignore some system check processes,
# like the old password verification when updating the credential
# superUsers:
# - "root"
# tls mode values [0, 1, 2]
# 0 is close, 1 is one-way authentication, 2 is two-way authentication.
tlsMode: 0
session:
ttl: 20 # ttl value when session granting a lease to register service
retryTimes: 30 # retry times when session sending etcd requests
ImportMaxFileSize: 17179869184 # 16 * 1024 * 1024 * 1024
# max file size to import for bulkInsert
# QuotaConfig, configurations of Milvus quota and limits.
# By default, we enable:
# 1. TT protection;
# 2. Memory protection.
# 3. Disk quota protection.
# You can enable:
# 1. DML throughput limitation;
# 2. DDL, DQL qps/rps limitation;
# 3. DQL Queue length/latency protection;
# 4. DQL result rate protection;
# If necessary, you can also manually force to deny RW requests.
quotaAndLimits:
enabled: true # `true` to enable quota and limits, `false` to disable.
limits:
maxCollectionNum: 65536
maxCollectionNumPerDB: 65536
# quotaCenterCollectInterval is the time interval that quotaCenter
# collects metrics from Proxies, Query cluster and Data cluster.
quotaCenterCollectInterval: 3 # seconds, (0 ~ 65536)
ddl: # ddl limit rates, default no limit.
enabled: false
collectionRate: -1 # qps, default no limit, rate for CreateCollection, DropCollection, LoadCollection, ReleaseCollection
partitionRate: -1 # qps, default no limit, rate for CreatePartition, DropPartition, LoadPartition, ReleasePartition
indexRate:
enabled: false
max: -1 # qps, default no limit, rate for CreateIndex, DropIndex
flushRate:
enabled: false
max: -1 # qps, default no limit, rate for flush
compactionRate:
enabled: false
max: -1 # qps, default no limit, rate for manualCompaction
# dml limit rates, default no limit.
# The maximum rate will not be greater than `max`.
dml:
enabled: false
insertRate:
collection:
max: -1 # MB/s, default no limit
max: -1 # MB/s, default no limit
deleteRate:
collection:
max: -1 # MB/s, default no limit
max: -1 # MB/s, default no limit
bulkLoadRate: # not support yet. TODO: limit bulkLoad rate
collection:
max: -1 # MB/s, default no limit
max: -1 # MB/s, default no limit
# dql limit rates, default no limit.
# The maximum rate will not be greater than `max`.
dql:
enabled: false
searchRate:
collection:
max: -1 # vps (vectors per second), default no limit
max: -1 # vps (vectors per second), default no limit
queryRate:
collection:
max: -1 # qps, default no limit
max: -1 # qps, default no limit
# limitWriting decides whether dml requests are allowed.
limitWriting:
# forceDeny `false` means dml requests are allowed (except for some
# specific conditions, such as memory of nodes to water marker), `true` means always reject all dml requests.
forceDeny: false
ttProtection:
enabled: false
# maxTimeTickDelay indicates the backpressure for DML Operations.
# DML rates would be reduced according to the ratio of time tick delay to maxTimeTickDelay,
# if time tick delay is greater than maxTimeTickDelay, all DML requests would be rejected.
maxTimeTickDelay: 300 # in seconds
memProtection:
enabled: true
# When memory usage > memoryHighWaterLevel, all dml requests would be rejected;
# When memoryLowWaterLevel < memory usage < memoryHighWaterLevel, reduce the dml rate;
# When memory usage < memoryLowWaterLevel, no action.
# memoryLowWaterLevel should be less than memoryHighWaterLevel.
dataNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in DataNodes
dataNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in DataNodes
queryNodeMemoryLowWaterLevel: 0.85 # (0, 1], memoryLowWaterLevel in QueryNodes
queryNodeMemoryHighWaterLevel: 0.95 # (0, 1], memoryHighWaterLevel in QueryNodes
growingSegmentsSizeProtection:
# 1. No action will be taken if the ratio of growing segments size is less than the low water level.
# 2. The DML rate will be reduced if the ratio of growing segments size is greater than the low water level and less than the high water level.
# 3. All DML requests will be rejected if the ratio of growing segments size is greater than the high water level.
enabled: false
lowWaterLevel: 0.2
highWaterLevel: 0.4
diskProtection:
# When the total file size of object storage is greater than `diskQuota`, all dml requests would be rejected;
enabled: true
diskQuota: -1 # MB, (0, +inf), default no limit
diskQuotaPerCollection: -1 # MB, (0, +inf), default no limit
# limitReading decides whether dql requests are allowed.
limitReading:
# forceDeny `false` means dql requests are allowed (except for some
# specific conditions, such as collection has been dropped), `true` means always reject all dql requests.
forceDeny: false
queueProtection:
enabled: false
# nqInQueueThreshold indicated that the system was under backpressure for Search/Query path.
# If NQ in any QueryNode's queue is greater than nqInQueueThreshold, search&query rates would gradually cool off
# until the NQ in queue no longer exceeds nqInQueueThreshold. We think of the NQ of query request as 1.
nqInQueueThreshold: -1 # int, default no limit
# queueLatencyThreshold indicated that the system was under backpressure for Search/Query path.
# If dql latency of queuing is greater than queueLatencyThreshold, search&query rates would gradually cool off
# until the latency of queuing no longer exceeds queueLatencyThreshold.
# The latency here refers to the averaged latency over a period of time.
queueLatencyThreshold: -1 # milliseconds, default no limit
resultProtection:
enabled: false
# maxReadResultRate indicated that the system was under backpressure for Search/Query path.
# If dql result rate is greater than maxReadResultRate, search&query rates would gradually cool off
# until the read result rate no longer exceeds maxReadResultRate.
maxReadResultRate: -1 # MB/s, default no limit
# coolOffSpeed is the speed of search&query rates cool off.
coolOffSpeed: 0.9 # (0, 1]
autoIndex:
params:
build: '{"M": 30,"efConstruction": 360,"index_type": "HNSW", "metric_type": "IP"}'
/assign @congqixia /unassign
I also encountered this panic problem, also with a CentOS-based image. Is there any progress on this issue now?
- Milvus version: 2.2.9
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka): pulsar
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): Centos
- CPU/Memory: 4c8g
- GPU: no
- Others:
@yanliang567
@congqixia any ideas?
@mrrtree Sorry for the late reply. Quick question: which OSS service did you use when you encountered this problem?
MinIO. I guess the problem is the OpenSSL version, which is 1.0.2 on CentOS 7.
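To check that guess, here is a small, hypothetical Go helper (the paths are assumptions; point it at your self-compiled `milvus` binary or at the cgo libraries under `lib/`) that lists the shared libraries an ELF file was linked against, so you can see which `libssl`/`libcrypto` it pulls in:

```go
package main

import (
	"debug/elf"
	"fmt"
	"log"
	"os"
	"strings"
)

func main() {
	// Hypothetical default path; pass a different one as the first argument,
	// e.g. the self-compiled milvus binary or a segcore shared library.
	path := "/milvus/bin/milvus"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	f, err := elf.Open(path)
	if err != nil {
		log.Fatalf("open %s: %v", path, err)
	}
	defer f.Close()

	libs, err := f.ImportedLibraries()
	if err != nil {
		log.Fatalf("read imported libraries of %s: %v", path, err)
	}
	for _, lib := range libs {
		marker := ""
		if strings.Contains(lib, "ssl") || strings.Contains(lib, "crypto") {
			marker = "   <-- OpenSSL-related"
		}
		fmt.Println(lib + marker)
	}
}
```

Comparing the output for the official image's binary and the self-compiled one should show whether they link different OpenSSL versions.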
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.