gorse
High availability support
Is your feature request related to a problem? Please describe. I want to run Gorse in a high-availability setup. Currently it is not possible to run multiple master nodes, so if the master node goes down, there is no backup.
Describe the solution you'd like I want Gorse to support master-master replication. Multiple masters should be able to communicate with each other to orchestrate tasks.
Describe alternatives you've considered One option is to run a duplicate Gorse instance with an identical configuration and use it as a backup when the first one is down. But the two instances would compute recommendations redundantly because they know nothing about each other.
Additional context We can discuss implementation details if needed.
We plan to support multiple master nodes in 0.4.x. However, there is still a long way to go. Several things need to be done:
- [ ] Use etcd for membership management.
- [ ] Implement distributed training.
- [ ] Implement distributed scheduling.