[Bug]: Querynode panic with error `runtime error: invalid memory address or nil pointer dereference`
Is there an existing issue for this?
- [X] I have searched the existing issues
Environment
- Milvus version:2.2.5-20230401-fd5bedbe
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka):pulsar
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
Expected Behavior
all components are healthy
Steps To Reproduce
No response
Milvus Log
[2023/04/02 11:20:25.510 +00:00] [INFO] [querynode/segment_loader.go:731] ["from dml check point load delete"] [position="channel_name:\"by-dev-rootcoord-dml_64\" msgID:\"\\010B\\020\\307\\001\\030\\000 \\000\" msgGroup:\"by-dev-dataNode-10-by-dev-rootcoord-dml_64_440515680902934859v0\" timestamp:440515767735681024 "] [subName=querynode-delta-loader-3-440515680902934859-2748845761490220111] [positionTs=2023/04/02 11:18:14.646 +00:00]
[2023/04/02 11:20:25.519 +00:00] [INFO] [querynode/impl_utils.go:72] ["ReleaseSegments start to transfer release with shard cluster"] [traceID=613821d389588a2e] [shard=by-dev-rootcoord-dml_16_440515680901730037v0] [segmentIDs="[440515680902935116]"] [scope=All]
[2023/04/02 11:20:25.520 +00:00] [INFO] [querynode/impl_utils.go:72] ["ReleaseSegments start to transfer release with shard cluster"] [traceID=34c81f23790d8b82] [shard=by-dev-rootcoord-dml_15_440515680901730033v1] [segmentIDs="[440515680902934890]"] [scope=All]
[2023/04/02 11:20:25.520 +00:00] [INFO] [querynode/impl_utils.go:72] ["ReleaseSegments start to transfer release with shard cluster"] [traceID=c8b8bb1a977f589] [shard=by-dev-rootcoord-dml_19_440515680901730041v1] [segmentIDs="[440515680902934812]"] [scope=All]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x2bfe95a]
goroutine 102385 [running]:
github.com/milvus-io/milvus/internal/querynode.(*distribution).RemoveDistributions(0x0, 0xc008c75290, {0xc0008a2a80, 0x1, 0xc008c75278?})
/go/src/github.com/milvus-io/milvus/internal/querynode/distribution.go:205 +0x5a
github.com/milvus-io/milvus/internal/querynode.(*ShardCluster).ReleaseSegments(0xc0018a8d80, {0x426adf8, 0xc008147770}, 0xc0089ec900, 0x0)
/go/src/github.com/milvus-io/milvus/internal/querynode/shard_cluster.go:604 +0x52c
github.com/milvus-io/milvus/internal/querynode.(*QueryNode).TransferRelease(0xc000290870, {0x426adf8, 0xc008147770}, 0xc0089ec900)
/go/src/github.com/milvus-io/milvus/internal/querynode/impl_utils.go:84 +0x3eb
github.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseSegments(0xc000290870, {0x426adf8, 0xc008147770}, 0xc0089ec900)
/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:642 +0x251
github.com/milvus-io/milvus/internal/distributed/querynode.(*Server).ReleaseSegments(0xf?, {0x426adf8?, 0xc008147770?}, 0xf?)
/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:302 +0x2e
github.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseSegments_Handler.func1({0x426adf8, 0xc008147770}, {0x3d38ce0?, 0xc0089ec900})
/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:5667 +0x78
github.com/milvus-io/milvus/internal/util/logutil.UnaryTraceLoggerInterceptor({0x426adf8?, 0xc0081476e0?}, {0x3d38ce0, 0xc0089ec900}, 0x4254f20?, 0xc0040a7fb0)
/go/src/github.com/milvus-io/milvus/internal/util/logutil/grpc_interceptor.go:22 +0x49
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x426adf8?, 0xc0081476e0?}, {0x3d38ce0?, 0xc0089ec900?})
/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/chain.go:25 +0x3a
github.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing.UnaryServerInterceptor.func1({0x426adf8, 0xc008147650}, {0x3d38ce0, 0xc0089ec900}, 0xc0044f7740?, 0xc0044f7760)
/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/tracing/opentracing/server_interceptors.go:38 +0x16a
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1({0x426adf8?, 0xc008147650?}, {0x3d38ce0?, 0xc0089ec900?})
/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/chain.go:25 +0x3a
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1({0x426adf8, 0xc008147650}, {0x3d38ce0, 0xc0089ec900}, 0xc005840af0?, 0x3aa9fc0?)
/go/pkg/mod/github.com/grpc-ecosystem/[email protected]/chain.go:34 +0xbf
github.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseSegments_Handler({0x3d6d080?, 0xc000980c60}, {0x426adf8, 0xc008147650}, 0xc0094125a0, 0xc001816210)
/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:5669 +0x138
google.golang.org/grpc.(*Server).processUnaryRPC(0xc000948e00, {0x427b650, 0xc00201e4e0}, 0xc0048ed320, 0xc001816390, 0x57540c0, 0x0)
/go/pkg/mod/google.golang.org/[email protected]/server.go:1283 +0xcfd
google.golang.org/grpc.(*Server).handleStream(0xc000948e00, {0x427b650, 0xc00201e4e0}, 0xc0048ed320, 0x0)
/go/pkg/mod/google.golang.org/[email protected]/server.go:1620 +0xa1b
google.golang.org/grpc.(*Server).serveStreams.func1.2()
/go/pkg/mod/google.golang.org/[email protected]/server.go:922 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
/go/pkg/mod/google.golang.org/[email protected]/server.go:920 +0x28a
milvus mode: cluster
deploy task: reinstall
old image tag: v2.2.3
new image tag: 2.2.5-20230401-fd5bedbe
failed job: https://qa-jenkins.milvus.io/blue/organizations/jenkins/deploy_test_for_release_cron/detail/deploy_test_for_release_cron/131/pipeline
log:
artifacts-pulsar-cluster-reinstall-131-server-logs (2).tar.gz
artifacts-pulsar-cluster-reinstall-131-pytest-logs (1).tar.gz
Anything else?
No response
/unassign @yanliang567 /assign
The panic occurs because the `distribution` is not set up yet when the release request reaches the querynode. The load path has the same problem.
image: master-20230404-940ead20
My querynode also panicked with `runtime error: invalid memory address or nil pointer dereference`. I don't know whether it is the same problem, since I did not release anything.
I only ran a query with count(*).
panic logs:
[2023/04/06 03:50:58.806 +00:00] [DEBUG] [querynodev2/services.go:73] ["Get QueryNode component state done"] [stateCode=Healthy]
[2023/04/06 03:50:59.914 +00:00] [DEBUG] [pipeline/stream_pipeline.go:57] ["stream pipeline fetch msg"] [sum=0]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/services.go:676] ["received query request"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [shards="[cluster-count-rootcoord-dml_2_440598985649691025v0]"] [outputFields="[]"] [segmentIDs="[]"] [guaranteeTimestamp=440599329634254850] [travelTimestamp=440599330944974850]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/services.go:676] ["received query request"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [shards="[cluster-count-rootcoord-dml_3_440598985649691025v1]"] [outputFields="[]"] [segmentIDs="[]"] [guaranteeTimestamp=440599329634254850] [travelTimestamp=440599330944974850]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/handlers.go:129] ["start do query with channel"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [msgID=440599330944974850] [collectionID=440598985649691025] [channel=cluster-count-rootcoord-dml_3_440598985649691025v1] [scope=All] [fromShardLeader=false] [segmentIDs="[]"]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/handlers.go:129] ["start do query with channel"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [msgID=440599330944974850] [collectionID=440598985649691025] [channel=cluster-count-rootcoord-dml_2_440598985649691025v0] [scope=All] [fromShardLeader=false] [segmentIDs="[]"]
[2023/04/06 03:51:03.034 +00:00] [INFO] [delegator/delegator.go:279] ["query segments..."] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [channel=cluster-count-rootcoord-dml_3_440598985649691025v1] [replicaID=440598993217257473] [sealedNum=0] [growingNum=1]
[2023/04/06 03:51:03.034 +00:00] [INFO] [delegator/delegator.go:279] ["query segments..."] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [channel=cluster-count-rootcoord-dml_2_440598985649691025v0] [replicaID=440598993217257473] [sealedNum=0] [growingNum=1]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/services.go:676] ["received query request"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [shards="[cluster-count-rootcoord-dml_2_440598985649691025v0]"] [outputFields="[]"] [segmentIDs="[440598985649891069]"] [guaranteeTimestamp=440599329634254850] [travelTimestamp=440599330944974850]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/services.go:676] ["received query request"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [shards="[cluster-count-rootcoord-dml_3_440598985649691025v1]"] [outputFields="[]"] [segmentIDs="[440598985649891070]"] [guaranteeTimestamp=440599329634254850] [travelTimestamp=440599330944974850]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/handlers.go:129] ["start do query with channel"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [msgID=440599330944974850] [collectionID=440598985649691025] [channel=cluster-count-rootcoord-dml_2_440598985649691025v0] [scope=Streaming] [fromShardLeader=true] [segmentIDs="[440598985649891069]"]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [querynodev2/handlers.go:129] ["start do query with channel"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [msgID=440599330944974850] [collectionID=440598985649691025] [channel=cluster-count-rootcoord-dml_3_440598985649691025v1] [scope=Streaming] [fromShardLeader=true] [segmentIDs="[440598985649891070]"]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/validate.go:52] ["read target partitions"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [partitionIDs="[440598985649691026]"]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/validate.go:52] ["read target partitions"] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [collectionID=440598985649691025] [partitionIDs="[440598985649691026]"]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/segment.go:401] ["do retrieve on segment"] [collectionID=440598985649691025] [partitionID=440598985649691026] [segmentID=440598985649891070] [msgID=440599330944974850] [segmentType=Growing]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/segment.go:401] ["do retrieve on segment"] [collectionID=440598985649691025] [partitionID=440598985649691026] [segmentID=440598985649891069] [msgID=440599330944974850] [segmentType=Growing]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/segment.go:414] ["retrieve result"] [collectionID=440598985649691025] [partitionID=440598985649691026] [segmentID=440598985649891070] [resultNum=0]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/result.go:279] [mergeSegcoreRetrieveResults] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [limit=-1] [resultNum=1]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/segment.go:414] ["retrieve result"] [collectionID=440598985649691025] [partitionID=440598985649691026] [segmentID=440598985649891069] [resultNum=0]
[2023/04/06 03:51:03.034 +00:00] [DEBUG] [segments/result.go:279] [mergeSegcoreRetrieveResults] [traceID=c5ace5cc04b1325fd38f813fb3b6eb6f] [limit=-1] [resultNum=1]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x19c9120]

goroutine 10241 [running]:
panic({0x3616a60, 0x52b91a0})
/usr/local/go/src/runtime/panic.go:941 +0x397 fp=0xc00232d418 sp=0xc00232d358 pc=0x12b9cf7
runtime.panicmem(...)
/usr/local/go/src/runtime/panic.go:220
runtime.sigpanic()
/usr/local/go/src/runtime/signal_unix.go:818 +0x336 fp=0xc00232d468 sp=0xc00232d418 pc=0x12d0d16
github.com/milvus-io/milvus/internal/util/typeutil.GetSizeOfIDs(0x3e21390?)
/go/src/github.com/milvus-io/milvus/internal/util/typeutil/schema.go:652 fp=0xc00232d470 sp=0xc00232d468 pc=0x19c9120
github.com/milvus-io/milvus/internal/querynodev2/segments.MergeSegcoreRetrieveResults({0x3e21390?, 0xc00290a9c0?}, {0xc001855da8, 0x1, 0x1?}, 0xffffffffffffffff)
/go/src/github.com/milvus-io/milvus/internal/querynodev2/segments/result.go:294 +0x2cd fp=0xc00232d6e0 sp=0xc00232d470 pc=0x2d61bed
github.com/milvus-io/milvus/internal/querynodev2/segments.MergeSegcoreRetrieveResultsAndFillIfEmpty({0x3e21390?, 0xc00290a9c0?}, {0xc001855da8?, 0xc000741da0?, 0x0?}, 0x0?, {0x0, 0x0, 0x0}, 0xc001fcdd40)
/go/src/github.com/milvus-io/milvus/internal/querynodev2/segments/result.go:367 +0x45 fp=0xc00232d738 sp=0xc00232d6e0 pc=0x2d622c5
github.com/milvus-io/milvus/internal/querynodev2.(*QueryNode).querySegments(0xc00109a5a0, {0x3e21390, 0xc00290a9c0}, 0xc002693260)
/go/src/github.com/milvus-io/milvus/internal/querynodev2/handlers.go:240 +0x2bd fp=0xc00232d810 sp=0xc00232d738 pc=0x2da7a1d
github.com/milvus-io/milvus/internal/querynodev2.(*QueryNode).queryChannel(0xc00109a5a0, {0x3e21390, 0xc00290a900}, 0xc002693260, {0xc001e59380, 0x32})
/go/src/github.com/milvus-io/milvus/internal/querynodev2/handlers.go:140 +0x966 fp=0xc00232df10 sp=0xc00232d810 pc=0x2da63a6
github.com/milvus-io/milvus/internal/querynodev2.(*QueryNode).Query.func1()
/go/src/github.com/milvus-io/milvus/internal/querynodev2/services.go:714 +0x50 fp=0xc00232df78 sp=0xc00232df10 pc=0x2db65f0
golang.org/x/sync/errgroup.(*Group).Go.func1()
/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x64 fp=0xc00232dfe0 sp=0xc00232df78 pc=0x2699d24
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00232dfe8 sp=0xc00232dfe0 pc=0x12ef0c1
created by golang.org/x/sync/errgroup.(*Group).Go
/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:72 +0xa5
@ThreadDao The master branch appears to have a different problem here. Could you please open a new issue for it?
ok. #23241
@zhuwenxing The fix PR has been merged. Could you please verify? /unassign /assign @zhuwenxing
Not reproduced in 2.2.6-20230417-41d9ab3d and 2.2.0-20230413-a66b77e4.