incubator-horaedb-meta

Improve the cleaning mechanism of ShardNode

Open ZuLiangWang opened this issue 1 year ago • 0 comments

Describe this problem

We found that the failover mechanism of the HoraeDB cluster failed, and the shard was not migrated when the machine went down.

Steps to reproduce

  • Make the etcd root path configuration in HoraeDB and HoraeMeta inconsistent.
  • Shut down a HoraeDB node.

Additional Information

  • Add a drop ShardNode API to deal with some extreme situations.
  • Add a new way to detect failed nodes that does not rely only on etcd's lease events.
    1. Use a background thread to continuously detect failed nodes.
    2. Detect failed nodes through heartbeats.
ZuLiangWang, Dec 28 '23 06:12