[Bug]: milvus-backup backup is successful, but restoration from the backup is failing
Current Behavior
milvus-backup check
Succeed to connect to milvus and storage.
Milvus version: 2.5.11
Storage:
milvus-bucket: milvus-bucket
milvus-rootpath: file
backup-bucket: a-bucket
backup-rootpath: backup
milvus-backup: 0.5.4
Command Execution: When restoring the collection using: ./milvus-backup restore -n BookEmbeddingByXml_time_2025_04_30_17_39_58 -s _bak
Observed Behavior: The process gets stuck at this point:
Log Output: [2025/04/30 17:42:05.992 +08:00] [INFO] [restore/collection.go:968] ["bulk insert task status"] [backup_db_name=default] [backup_collection_name=BookEmbeddingByXml] [target_db_name=default] [target_collection_name=BookEmbeddingByXml_bak] [jobID=457681197306260455] [state=ImportStarted] [backup="[{"key":"failed_reason"},{"key":"progress_percent","value":"70"}]"]
Expected Behavior
No response
Steps To Reproduce
Environment
Anything else?
No response
@fengchen8556203 Could you check if the storage (object storage, i.e. MinIO) is full or some drives are offline?
Everything checked out fine. The restore did finish after a long wait, but it took significantly more time than with the previous version (2.3.11).
Based on the logs, the restore task is stuck at 70% with the status ImportStarted, which usually indicates that the data import phase has completed, and the system is now building indexes. Please check the logs of Milvus's datanode and indexnode to see if there are any errors or issues related to index building, as well as their load conditions during this process.
If the logs indicate any specific errors or unusual load, feel free to share them with us so we can assist you further.
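When a restore looks stuck, it can help to pull the job ID, state, and progress out of the status lines and check whether anything has changed across several polls. The following is a minimal sketch that assumes the log format shown in the report above. The function names `parse_status` and `looks_stalled` are made up for illustration and are not part of milvus-backup.

```python
import re

# Status line copied from the log output in this report.
SAMPLE = (
    '[2025/04/30 17:42:05.992 +08:00] [INFO] [restore/collection.go:968] '
    '["bulk insert task status"] [backup_db_name=default] '
    '[backup_collection_name=BookEmbeddingByXml] [target_db_name=default] '
    '[target_collection_name=BookEmbeddingByXml_bak] '
    '[jobID=457681197306260455] [state=ImportStarted] '
    '[backup="[{"key":"failed_reason"},{"key":"progress_percent","value":"70"}]"]'
)


def parse_status(line):
    """Extract job ID, state and progress from one status log line.

    Returns None when the line carries no jobID/state fields.
    """
    job = re.search(r"\[jobID=(\d+)\]", line)
    state = re.search(r"\[state=(\w+)\]", line)
    progress = re.search(r'"key":"progress_percent","value":"(\d+)"', line)
    if not (job and state):
        return None
    return {
        "job_id": job.group(1),
        "state": state.group(1),
        "progress_percent": int(progress.group(1)) if progress else None,
    }


def looks_stalled(lines, min_repeats=3):
    """True if the last `min_repeats` status lines show the same state and progress."""
    statuses = [s for s in map(parse_status, lines) if s is not None]
    if len(statuses) < min_repeats:
        return False
    tail = statuses[-min_repeats:]
    return len({(s["state"], s["progress_percent"]) for s in tail}) == 1


if __name__ == "__main__":
    print(parse_status(SAMPLE))
    print(looks_stalled([SAMPLE] * 3))
```

An unchanged state and progress value only shows that the reported numbers are not moving. It does not say which phase Milvus is in, so the datanode and indexnode logs are still the place to look for the cause.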
I encountered the same problem. Is there a solution?
I found no exception logs in Milvus's datanode or indexnode. If the task really is in the index-building state, it stays there far too long.
The restoration of 20 million records failed. The following is the error message:
[2025/09/05 14:12:33.613 +08:00] [ERROR] [core/backup_impl_restore_backup.go:114] ["execute restore collection fail"] [backupId=f2574991-8967-11f0-ae4a-7cc25575c23c] [error="backup: restore backup task execute fail, err: restore: execute restore: run collection task restore: wait collection worker pool restore: restore collection restore_collection: restore data: restore_collection: restore data v1: restore_collection: restore partition data v1: restore_collection: restore not L0 groups: restore_collection: restore not L0 segment v1: restore_collection: bulk insert failed: context deadline exceeded"] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).RestoreBackup\n\t/home/runner/work/milvus-backup/milvus-backup/core/backup_impl_restore_backup.go:114\ngithub.com/zilliztech/milvus-backup/cmd/restore.(*options).run\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/restore/restore.go:133\ngithub.com/zilliztech/milvus-backup/cmd/restore.NewCmd.func1\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/restore/restore.go:161\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1015\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1148\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1071\ngithub.com/zilliztech/milvus-backup/cmd.Execute\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/cmd.go:37\nmain.main\n\t/home/runner/work/milvus-backup/milvus-backup/main.go:20\nruntime.main\n\t/opt/hostedtoolcache/go/1.24.4/x64/src/runtime/proc.go:283"]
Error: restore backup failed: backup: restore backup task execute fail, err: restore: execute restore: run collection task restore: wait collection worker pool restore: restore collection restore_collection: restore data: restore_collection: restore data v1: restore_collection: restore partition data v1: restore_collection: restore not L0 groups: restore_collection: restore not L0 segment v1: restore_collection: bulk insert failed: context deadline exceeded
I'm using milvus-backup v0.5.7. My milvus cluster is v2.5.6. Both are containerized. I also encountered the same issue.
Error: restore backup failed: backup: restore backup task execute fail, err: restore: execute restore: run collection task restore: wait collection worker pool restore: restore collection restore_collection: restore data: restore_collection: restore data v1: restore_collection: restore partition data v1: restore_collection: restore not L0 groups: restore_collection: restore not L0 segment v1: restore_collection: bulk insert failed: segment is not healthy
The following error will also occur: [2025/09/08 19:12:12.912 +08:00] [INFO] [restore/collection.go:728] ["bulk insert task state"] [restore_task_id=721e6d68-ab07-48e9-a157-5da5a2aa9100] [backup_ns=default.OpsMM_1536] [target_ns=default.OpsXX_1536] [jobID=460664905050889866] [state=ImportPending] [backup="[{"key":"failed_reason"},{"key":"progress_percent","value":"10"}]"]
Then it fails to run:
[2025/09/08 19:38:43.254 +08:00] [ERROR] [errgroup/errgroup.go:130] ["restore coll failed"] [backup_name=backup0902test] [backup_path=backup/backup0902test] [target_ns=default.OpsMM_1536] [error="restore_collection: restore data: restore_collection: restore data v1: restore_collection: restore partition data v1: restore_collection: restore not L0 groups: restore_collection: restore not L0 segment v1: restore_collection: bulk insert failed: segment is not healthy"] [stack="golang.org/x/sync/errgroup.(*Group).add.func1\n\t/home/runner/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:130"]
[2025/09/08 19:38:43.254 +08:00] [ERROR] [core/backup_impl_restore_backup.go:158] ["restore task failed"] [backup_name=backup0902test] [backup_path=backup/backup0902test] [error="restore: run collection task restore: wait collection worker pool restore: restore collection restore_collection: restore data: restore_collection: restore data v1: restore_collection: restore partition data v1: restore_collection: restore not L0 groups: restore_collection: restore not L0 segment v1: restore_collection: bulk insert failed: segment is not healthy"] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).executeRestoreBackupTask\n\t/home/runner/work/milvus-backup/milvus-backup/core/backup_impl_restore_backup.go:158\ngithub.com/zilliztech/milvus-backup/core.(*BackupContext).RestoreBackup\n\t/home/runner/work/milvus-backup/milvus-backup/core/backup_impl_restore_backup.go:111\ngithub.com/zilliztech/milvus-backup/cmd/restore.(*options).run\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/restore/restore.go:133\ngithub.com/zilliztech/milvus-backup/cmd/restore.NewCmd.func1\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/restore/restore.go:161\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1015\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1148\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1071\ngithub.com/zilliztech/milvus-backup/cmd.Execute\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/cmd.go:37\nmain.main\n\t/home/runner/work/milvus-backup/milvus-backup/main.go:20\nruntime.main\n\t/opt/hostedtoolcache/go/1.24.4/x64/src/runtime/proc.go:283"]
[2025/09/08 19:38:43.255 +08:00] [ERROR] [core/backup_impl_restore_backup.go:114] ["execute restore collection fail"] [backupId=31abf5c3-87ef-11f0-a504-7cc25575c23c] [error="backup: restore backup task execute fail, err: restore: execute restore: run collection task restore: wait collection worker pool restore: restore collection restore_collection: restore data: restore_collection: restore data v1: restore_collection: restore partition data v1: restore_collection: restore not L0 groups: restore_collection: restore not L0 segment v1: restore_collection: bulk insert failed: segment is not healthy"] [stack="github.com/zilliztech/milvus-backup/core.(*BackupContext).RestoreBackup\n\t/home/runner/work/milvus-backup/milvus-backup/core/backup_impl_restore_backup.go:114\ngithub.com/zilliztech/milvus-backup/cmd/restore.(*options).run\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/restore/restore.go:133\ngithub.com/zilliztech/milvus-backup/cmd/restore.NewCmd.func1\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/restore/restore.go:161\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1015\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1148\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1071\ngithub.com/zilliztech/milvus-backup/cmd.Execute\n\t/home/runner/work/milvus-backup/milvus-backup/cmd/cmd.go:37\nmain.main\n\t/home/runner/work/milvus-backup/milvus-backup/main.go:20\nruntime.main\n\t/opt/hostedtoolcache/go/1.24.4/x64/src/runtime/proc.go:283"]
Error: restore backup failed: backup: restore backup task execute fail, err: restore: execute restore: run collection task restore: wait collection worker pool restore: restore collection restore_collection: restore data: restore_collection: restore data v1: restore_collection: restore partition data v1: restore_collection: restore not L0 groups: restore_collection: restore not L0 segment v1: restore_collection: bulk insert failed: segment is not healthy
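The error strings in this thread are long wrapped chains, and the part that differs between reports is the last segment ("context deadline exceeded" in one, "segment is not healthy" in the other). The sketch below pulls out that innermost message so failures can be grouped by cause. It assumes the wrapping layers are joined with ": " and that the innermost message contains no ": " itself, which may not hold for every error. The helper names `root_cause` and `count_causes` are hypothetical, and the `ERRORS` list is assembled here for illustration from the messages quoted above.

```python
from collections import Counter

# Wrapping layers shared by the failures quoted in this thread.
PREFIX = (
    "restore_collection: restore data: restore_collection: restore data v1: "
    "restore_collection: restore partition data v1: "
    "restore_collection: restore not L0 groups: "
    "restore_collection: restore not L0 segment v1: "
    "restore_collection: bulk insert failed: "
)

# Illustrative input: one chain per failed restore attempt.
ERRORS = [
    PREFIX + "context deadline exceeded",
    PREFIX + "segment is not healthy",
    PREFIX + "segment is not healthy",
]


def root_cause(chain):
    """Return the innermost message of a ': '-joined wrapped error chain."""
    return chain.rsplit(": ", 1)[-1].strip()


def count_causes(chains):
    """Count how many chains end in each innermost message."""
    return Counter(root_cause(c) for c in chains)


if __name__ == "__main__":
    for cause, n in count_causes(ERRORS).most_common():
        print(n, cause)
```

Grouping this way keeps the timeout case separate from the unhealthy-segment case, since the two may well have different causes even though the outer wrapping is identical.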