[Bug]: [backup]When latest Milvus-Backup backs up 2.6 milvus, proxy panic: runtime error: invalid memory address or nil pointer dereference

Open qixuan0212 opened this issue 10 months ago • 4 comments

Is there an existing issue for this?

  • [x] I have searched the existing issues

Environment

- Milvus version: master-20250615-d35c33da-amd64
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): both 
- SDK version(e.g. pymilvus v2.0.0rc2): 2.6.0rc139
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

[2025/06/16 17:30:50.529 +08:00] [INFO] [backup/collection.go:695] ["backup segment"] [db_name=default] [collection_name=customized_setup_query_1747220631] [backup_id=964d5b86-4a94-11f0-9a90-0ebcfbebcb99] [segment_id=458762767238150234]
[2025/06/16 17:30:50.530 +08:00] [INFO] [backup/collection.go:695] ["backup segment"] [db_name=default] [collection_name=customized_setup_query_1747220631] [backup_id=964d5b86-4a94-11f0-9a90-0ebcfbebcb99] [segment_id=458762767238179122]
[2025/06/16 17:30:50.530 +08:00] [INFO] [backup/task.go:119] ["backup all collections successfully"] [backup_id=964d5b86-4a94-11f0-9a90-0ebcfbebcb99]
[2025/06/16 17:30:50.530 +08:00] [INFO] [backup/task.go:123] ["skip backup rbac"] [backup_id=964d5b86-4a94-11f0-9a90-0ebcfbebcb99]
[2025/06/16 17:30:50.530 +08:00] [INFO] [backup/task.go:127] ["start backup rpc channel pos"] [backup_id=964d5b86-4a94-11f0-9a90-0ebcfbebcb99]
[2025/06/16 17:30:50.530 +08:00] [INFO] [backup/task.go:393] ["try to get rpc channel pos"] [backup_id=964d5b86-4a94-11f0-9a90-0ebcfbebcb99] [rpc_channel=by-dev-replicate-msg]
[2025/06/16 17:31:13.734 +08:00] [WARN] [grpclog/grpclog.go:155] ["[core][Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "10.104.18.18:19530", ServerName: "10.104.18.18:19530", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "error reading server preface: read tcp 172.16.20.50:49361->10.104.18.18:19530: use of closed network connection""]
[2025/06/16 17:31:16.839 +08:00] [WARN] [grpclog/grpclog.go:155] ["[core][Channel #1 SubChannel #2]grpc: addrConn.createTransport failed to connect to {Addr: "10.104.18.18:19530", ServerName: "10.104.18.18:19530", BalancerAttributes: {"<%!p(pickfirstleaf.managedByPickfirstKeyType={})>": "<%!p(bool=true)>" }}. Err: connection error: desc = "error reading server preface: read tcp 172.16.20.50:49362->10.104.18.18:19530: use of closed network connection""]

Expected Behavior

The backup succeeds.

Steps To Reproduce

Run milvus-backup (v0.5.6) to back up a 2.6 Milvus instance (master-20250615-d35c33da-amd64).

Milvus Log

https://grafana-4am.zilliz.cc/explore?orgId=1&panes=%7B%22PRW%22:%7B%22datasource%22:%22vhI6Vw67k%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bcluster%3D%5C%224am%5C%22,namespace%3D%5C%22qa-milvus%5C%22,pod%3D~%5C%22qx2-addfield-qlytv-milvus-proxy-69b44cffbb-5mzsv%5C%22%7D%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22vhI6Vw67k%22%7D,%22editorMode%22:%22code%22,%22queryType%22:%22range%22%7D%5D,%22range%22:%7B%22from%22:%221750066230000%22,%22to%22:%221750066280000%22%7D%7D%7D&schemaVersion=1

Anything else?

No response

qixuan0212 avatar Jun 16 '25 09:06 qixuan0212

/assign @chyezh

qixuan0212 avatar Jun 16 '25 10:06 qixuan0212

/assign @SimFG please help check it

xiaofan-luan avatar Jun 16 '25 21:06 xiaofan-luan

@SimFG I just checked this with @chyezh yesterday.

SimFG avatar Jun 17 '25 02:06 SimFG

@xiaofan-luan We don't support the old CDC interface in 2.6 right now. I will fix it so that this interface returns an unimplemented error instead of panicking in the proxy. CDC support in 2.6 with the streaming service is still being designed.

chyezh avatar Jun 17 '25 02:06 chyezh
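
The fix described in the comment above (return an unimplemented error instead of panicking) follows a common guard pattern. The sketch below only illustrates that pattern and is not Milvus code: the Milvus proxy is written in Go, and every name here (`Proxy`, `LegacyCdcBackend`, `get_channel_position`, the status strings) is a hypothetical stand-in. It assumes the crash comes from calling into a component that is absent in 2.6.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical status codes, loosely modelled on gRPC's OK / UNIMPLEMENTED.
OK = "OK"
UNIMPLEMENTED = "UNIMPLEMENTED"


@dataclass
class Status:
    code: str
    reason: str = ""


class LegacyCdcBackend:
    """Hypothetical stand-in for whatever served the old CDC interface."""

    def channel_position(self, channel: str) -> str:
        return f"{channel}@0"


class Proxy:
    """Hypothetical proxy; legacy_cdc is None when old CDC is unavailable."""

    def __init__(self, legacy_cdc: Optional[LegacyCdcBackend] = None):
        self.legacy_cdc = legacy_cdc

    def get_channel_position(self, channel: str) -> Tuple[Status, Optional[str]]:
        # Guard the optional dependency instead of dereferencing it blindly,
        # so the caller gets a clear error rather than a crashed process.
        if self.legacy_cdc is None:
            return Status(UNIMPLEMENTED, "old CDC interface is not supported"), None
        return Status(OK), self.legacy_cdc.channel_position(channel)
```

With a guard like this, a client such as the backup tool would receive an explicit "unimplemented" status and could decide how to proceed, rather than seeing its connection drop as in the log above.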

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

stale[bot] avatar Jul 17 '25 14:07 stale[bot]

keep it

yanliang567 avatar Jul 25 '25 03:07 yanliang567

Is there a way to address this and make successful backups with the milvus backup tool in the meantime? Is disabling cdc an option?

tmart-ops avatar Oct 16 '25 19:10 tmart-ops

> Is there a way to address this and make successful backups with the milvus backup tool in the meantime? Is disabling cdc an option?

@tmart-ops Yes, the backup tool is being fixed. We will first support backing up without CDC.

chyezh avatar Oct 17 '25 08:10 chyezh
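
One way to picture "backing up without CDC" on the client side is a backup flow that treats the channel-position step as optional. The sketch below is an assumption about the general shape of such a flow, not the milvus-backup implementation (which is written in Go). `run_backup`, `Unimplemented`, and the step names are hypothetical and only loosely mirror the steps visible in the log.

```python
from typing import Callable, Dict, List


class Unimplemented(Exception):
    """Hypothetical error for an RPC the server does not implement."""


def run_backup(steps: Dict[str, Callable[[], None]], optional: List[str]) -> Dict[str, str]:
    """Run steps in order; an unimplemented optional step is skipped, not fatal."""
    results: Dict[str, str] = {}
    for name, step in steps.items():
        try:
            step()
            results[name] = "done"
        except Unimplemented:
            if name not in optional:
                raise  # required steps still fail the whole backup
            results[name] = "skipped"
    return results


def backup_collections() -> None:
    pass  # placeholder for copying collection and segment data


def backup_channel_position() -> None:
    # Simulates a server that rejects the old CDC call.
    raise Unimplemented("old CDC interface is not supported")


if __name__ == "__main__":
    print(run_backup(
        {"collections": backup_collections, "rpc_channel_pos": backup_channel_position},
        optional=["rpc_channel_pos"],
    ))
```

In this sketch the collection data is still backed up and only the CDC-related position is omitted; whether a backup without that position is acceptable depends on whether the restore needs it.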