incubator-horaedb-meta
Ensure consistency between CeresDB nodes and metadata
Description
We implemented a basic dynamic cluster mode in version 0.4, but it cannot guarantee the consistency and correctness of the cluster. We need a solution that keeps clusters consistent even in extreme situations, so we decided to adapt the CeresMeta implementation according to the following principles:
- Procedures for the same cluster are executed strictly serially; no concurrency is allowed.
- While a procedure is running, no new procedure may be created.
- Before a procedure runs, the shard versions in metadata must match the shard versions on the actual nodes.
- If a procedure fails, it is not rolled back, and no new procedure can be submitted until the cluster state is manually reset to stable.
Proposal
Refactor the procedure module according to the above principles. This involves the following changes:
- `ProcedureManager` needs to ensure that only one procedure can run at any one time.
- `ProcedureFactory` cannot create a new procedure while a procedure is running.
- Every `Procedure` should compare the shard versions in metadata and on the nodes, and refuse to run when they are not equal.
- When a `Procedure` fails, `ProcedureManager` cannot submit a new procedure until the failed procedure is manually canceled.
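
To make the intended behaviour concrete, here is a minimal Go sketch of the four rules. It is not the CeresMeta code: `Manager`, `Procedure`, `ShardVersions`, `Submit`, `Reset`, and the error names are all hypothetical, and for brevity the sketch folds the `ProcedureFactory` check into `Submit` and takes the shard versions as plain maps instead of querying metadata and nodes.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Sentinel errors. The names are illustrative, not taken from CeresMeta.
var (
	ErrProcedureRunning = errors.New("another procedure is running")
	ErrClusterBlocked   = errors.New("a procedure failed; manual reset required")
	ErrVersionMismatch  = errors.New("shard version differs between metadata and node")
)

// ShardVersions maps a shard ID to its version (assumed representation).
type ShardVersions map[uint32]uint64

// Procedure is a hypothetical unit of cluster change.
type Procedure struct {
	Name string
	Run  func() error
}

// Manager serializes procedures for a single cluster.
type Manager struct {
	mu      sync.Mutex
	running bool // true while a procedure is executing
	blocked bool // true after a failure, until Reset is called
}

// checkVersions returns ErrVersionMismatch unless both maps hold the same
// shards with the same versions.
func checkVersions(meta, nodes ShardVersions) error {
	if len(meta) != len(nodes) {
		return ErrVersionMismatch
	}
	for id, v := range meta {
		if nv, ok := nodes[id]; !ok || nv != v {
			return ErrVersionMismatch
		}
	}
	return nil
}

// Submit runs p synchronously. It refuses to start if the cluster is blocked
// by an earlier failure, if another procedure is running, or if the shard
// versions disagree.
func (m *Manager) Submit(p Procedure, meta, nodes ShardVersions) error {
	m.mu.Lock()
	if m.blocked {
		m.mu.Unlock()
		return ErrClusterBlocked
	}
	if m.running {
		m.mu.Unlock()
		return ErrProcedureRunning
	}
	if err := checkVersions(meta, nodes); err != nil {
		m.mu.Unlock()
		return err
	}
	m.running = true
	m.mu.Unlock()

	// The lock is released while the procedure runs, so a concurrent
	// Submit is rejected immediately instead of waiting.
	err := p.Run()

	m.mu.Lock()
	m.running = false
	if err != nil {
		m.blocked = true // no rollback; wait for an operator
	}
	m.mu.Unlock()

	if err != nil {
		return fmt.Errorf("procedure %q failed: %w", p.Name, err)
	}
	return nil
}

// Reset models the manual step that returns the cluster to a stable state.
func (m *Manager) Reset() {
	m.mu.Lock()
	m.blocked = false
	m.mu.Unlock()
}
```

In this sketch a version mismatch only rejects the submission; it does not block the cluster, because the procedure never started. Whether a mismatch should also require manual intervention is a design decision the proposal leaves open.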
Additional context