Availability & Replication: Is failover supported or planned?
Hi team, this issue is not related to any bug or malfunction — I just wanted to ask whether Trickster supports availability and fault tolerance through process replication in any way.
From what I gathered in the documentation, it seems that Trickster doesn't currently support automatic failover in case the Trickster server crashes, which would result in the loss of in-memory data.
Is there any plan to introduce a replication mechanism for Trickster servers in the future? For example, in a setup where one node (let's say a leader) crashes, a follower could immediately take over and continue serving requests with an up-to-date cache and fully synchronized load balancer pools — or at least with some degree of stale reads, in case of eventual consistency.
I'm really interested to hear your thoughts on these features and whether you consider them essential for Trickster’s roadmap. Also, please let me know if any of these capabilities are already implemented in a way I may have missed or misunderstood.
This is not on the roadmap for the in-memory cache right now, but you can run a round-robin pool of Trickster processes that all use the same Redis cache instead of the in-process memory cache. You would trade a little performance (added network latency) for redundancy.
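To make the pooled setup a bit more concrete, here is a minimal sketch of the selection logic a client or front-end balancer could apply across several Trickster processes. It is only an illustration under stated assumptions: the `TricksterPool` class, its methods, and the endpoint hostnames are hypothetical and not part of Trickster, and health detection is left to the caller. Because every instance would read and write the same Redis cache, any healthy instance can serve the next request.

```python
class TricksterPool:
    """Round-robin selector over a pool of Trickster endpoints (hypothetical helper).

    Assumes every endpoint is backed by the same shared Redis cache,
    so it does not matter which healthy instance serves a request.
    """

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._healthy = set(self._endpoints)
        self._next = 0  # index of the endpoint to try first on the next pick

    def mark_down(self, endpoint):
        """Exclude an endpoint, e.g. after a failed health check."""
        self._healthy.discard(endpoint)

    def mark_up(self, endpoint):
        """Re-admit a known endpoint once it is reachable again."""
        if endpoint in self._endpoints:
            self._healthy.add(endpoint)

    def pick(self):
        """Return the next healthy endpoint in round-robin order."""
        n = len(self._endpoints)
        for offset in range(n):
            idx = (self._next + offset) % n
            candidate = self._endpoints[idx]
            if candidate in self._healthy:
                self._next = (idx + 1) % n
                return candidate
        raise RuntimeError("no healthy Trickster instances available")
```

A caller might build the pool with something like `TricksterPool(["http://trickster-1:8480", "http://trickster-2:8480"])` (placeholder hostnames), call `pick()` per request, and call `mark_down()` when an instance stops responding. In practice an off-the-shelf load balancer in front of the Trickster processes would usually fill this role.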
We are pretty locked in on the 2.0 release features (and getting that release out the door finally!), but if there is more interest in cache replication, we can slate it for the 2.1 release.