milvus-backup

[Feature]: Faster restore process

Open Pheggas opened this issue 1 year ago • 5 comments

Is your feature request related to a problem? Please describe.

Hello. Currently, when I restore a backup, the restoration takes too long even with a higher concurrency level (6). I'm currently at 30 hours of restoration for 415 million.
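For scale, the figures above can be turned into an average rate. This is only a rough sketch: it assumes the 415 million items were restored within those 30 hours, and the function name `average_rate` is made up for illustration.

```python
def average_rate(total_items: int, elapsed_hours: float) -> float:
    """Average items restored per second over the elapsed wall-clock time."""
    return total_items / (elapsed_hours * 3600)


# Figures quoted in the report: 415 million items, 30 hours so far.
rate = average_rate(415_000_000, 30)
print(f"{rate:,.0f} items/s")  # about 3,843 items/s
```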

Describe the solution you'd like.

I've found that only one core is used at a time during restoration, and HTOP shows it already at 100% usage. If the restoration process could use multiple cores, I could set even higher concurrency levels and get shorter restore times.

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

Pheggas avatar Mar 05 '25 08:03 Pheggas

When you mentioned that only one core is being used, are you referring to the backup tool or the datanode?

Additionally, are you using crossStorage? And how many segments do you currently have?

In Milvus 2.4 and later, we introduced multi-segment merge restoration, which is expected to improve performance. However, this feature is still under internal testing. Understanding your specific scenario would help us further optimize it, so we’d appreciate any additional details you can provide.

huanghaoyuanhhy avatar Mar 05 '25 10:03 huanghaoyuanhhy

Hello. Sorry for the very late response.

> When you mentioned that only one core is being used, are you referring to the backup tool or the datanode?

I'm referring to the backup tool. I think the datanode should be fine regarding core usage.

> Additionally, are you using crossStorage?

Yes, we use the milvus-backup tool to back up to MinIO and also restore from there.

> And how many segments do you currently have?

In the Attu UI, I currently see 876 segments from the latest restoration.

> In Milvus 2.4 and later, we introduced multi-segment merge restoration

We have v2.4.10 installed. Could a further upgrade help here?

It seems I set the parallelism far too low. I set it to 6, which is apparently really small according to this comment.

Thank you for your response!

Pheggas avatar Apr 02 '25 08:04 Pheggas

Thank you for your detailed response!

Regarding the CPU usage, we believe the issue lies in the crossStorage logic not properly utilizing multiple cores during the restoration process. We will look into this and plan to optimize it soon.

Also, Milvus v2.4.10 already supports multi-segment merge restoration. Once our internal testing is complete and the feature is stable, we will expose this capability via parameters in the milvus-backup tool. At that point, upgrading the tool and specifying the corresponding flags in the command line will be sufficient to enable it.

Increasing the backup.parallelism.restoreCollection setting is more effective when you have multiple collections being restored in parallel. However, if you're restoring a single large collection, this setting will have limited effect. In that case, optimizations like multi-segment merge and improved core utilization will play a more important role.
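The point about collection-level parallelism can be illustrated with a toy model. This is a sketch, not the milvus-backup implementation: it assumes the setting behaves like a worker pool with one task per collection, and `peak_concurrency` and `restore_one` are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(num_collections: int, restore_collection_parallelism: int) -> int:
    """Return the highest number of simulated collection restores running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def restore_one(_collection_index: int) -> None:
        # Hypothetical stand-in for restoring one collection.
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # placeholder for the real restore work
        with lock:
            running -= 1

    # Assumption: one pool task per collection, capped by the parallelism value.
    with ThreadPoolExecutor(max_workers=restore_collection_parallelism) as pool:
        list(pool.map(restore_one, range(num_collections)))
    return peak


print(peak_concurrency(1, 6))   # a single collection keeps only one worker busy
print(peak_concurrency(12, 6))  # many collections can keep up to 6 workers busy
```

In this model, raising the pool size past the number of collections leaves the extra workers idle, which is why work inside a single collection has to be parallelized separately.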

huanghaoyuanhhy avatar Apr 15 '25 11:04 huanghaoyuanhhy

Hi @huanghaoyuanhhy, how can we enable the multi-segment merge optimization and improved core utilization? Are there configs for this in milvus-backup:v0.5.3? Please share the details.

kish5432 avatar Apr 22 '25 22:04 kish5432

@kish5432

The multi-segment merge optimization and improved core utilization are currently still under internal testing. These features are not yet available in the milvus-backup:v0.5.3 version.

We are actively working on stabilizing them, and they are expected to be released by the end of June. Once available, we will provide configuration options and usage details in the documentation.

huanghaoyuanhhy avatar Apr 23 '25 06:04 huanghaoyuanhhy