incubator-horaedb-meta
Basic failover capability of CeresDB cluster
Description
We implemented the cluster management capability of CeresDB with `Procedure`, but `Procedure` only provides shard scheduling and does not actively check the state of the CeresDB cluster. The ability to fail over is still lacking.
Proposal
Implement the simplest form of failover for CeresDB cluster mode. After a CeresDB node crashes, the faulty node is automatically removed and the routing relationship is adjusted. This should include the following functions:
- Check whether the node crashed based on heartbeat.
- When a node is confirmed to have crashed, remove it from the metadata and transfer the leader by invoking `TransferLeaderProcedure`.
Additional context