redis-cluster
Auto-start of the cluster
This is a nice, clean example of a redis cluster running in k8s. The one challenge is cluster initialization and adding/removing nodes.
Is there any clean, self-managing (i.e. autonomous) way to do it? Since you are using a StatefulSet, you know the names of the (initial) pods will be redis-cluster-0, redis-cluster-1, etc. You could probably even use two StatefulSets if you wanted guarantees about which pods are masters vs. slaves.
Is there no way to have redis-cluster-0 automatically come up and initialize a cluster with redis-cluster-1 and redis-cluster-2 (or, for that matter, with just itself), and have redis-cluster-1 and redis-cluster-2 self-register to redis-cluster-0? The same goes for expansion.
In a kube environment, having to do kubectl exec ... is not optimal (or recommended).
I am kind of surprised that redis doesn't have any option in the config file for saying, "here are your initial cluster peers".
You kind of wish an InitContainer could do this, but init containers complete before the pod's main containers run, so that will not help. Perhaps some extension of the image, so it spins up a child process? Or a sidecar container (although that would be a one-off task that shouldn't be restarted, whereas restart policies are pod-level, not container-level)? Or a post-start lifecycle hook?
That is a good point. Indeed, redis cluster management is a bit tedious. If a self-managing way existed, I don't think you would even need a StatefulSet...
So if I understand you correctly, you want to do something like the following (let's assume a 1:1 master-slave ratio):
```
if podNr == 0 {
    set up new cluster
} else if podNr % 2 == 0 {
    join cluster
} else {
    join cluster as slave
}
```
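Concretely (just as an illustration, not something that is in this repo), each pod could run something like the sketch below at startup. It assumes Redis 5+, so redis-cli ships the --cluster subcommands, a headless service named redis-cluster, and that hostname is available in the image:

```sh
#!/bin/sh
# Hypothetical per-pod bootstrap -- assumes StatefulSet pods named
# redis-cluster-0, redis-cluster-1, ... behind a headless service called
# "redis-cluster", and Redis 5+ (redis-cli --cluster subcommands).
set -e

POD_NR="$(hostname | sed 's/.*-//')"          # StatefulSet ordinal, e.g. "2"
SEED="redis-cluster-0.redis-cluster:6379"     # first pod, via headless-service DNS
MY_ADDR="$(hostname -i):6379"

if [ "$POD_NR" -eq 0 ]; then
  # "set up new cluster": a real cluster needs at least 3 masters, so pod 0
  # would still have to wait for peers before running `redis-cli --cluster create`.
  echo "seed node: waiting for peers before creating the cluster"
elif [ $((POD_NR % 2)) -eq 0 ]; then
  # "join cluster" (as a master), then move some hash slots onto it.
  # In practice the new node has to be known cluster-wide before rebalancing.
  redis-cli --cluster add-node "$MY_ADDR" "$SEED"
  redis-cli --cluster rebalance "$SEED" --cluster-use-empty-masters
else
  # "join cluster as slave"; without --cluster-master-id, redis-cli attaches
  # the replica to the master with the fewest replicas.
  redis-cli --cluster add-node "$MY_ADDR" "$SEED" --cluster-slave
fi
```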
I like the idea, but it has some complications:
- We need to balance the keys across masters. This can be done when a new master is added, but I'm not sure how to automate rebalancing when a pod is removed (e.g. when scaling down).
- As I said, this assumes a specific master-slave ratio. The downside is that you cannot easily change this after deploying; you'd have to start from scratch or make the pods much more complex.
That being said, I wrote this how-to with the assumption that you do not scale your redis cluster up and down multiple times a day. Compared to spinning up new VMs and configuring them by hand, having to perform a handful of copy-and-paste commands whenever you need to scale up your cluster seems like a fair compromise, don't you think?
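To give an idea of what those copy-and-paste commands amount to, roughly (a sketch assuming Redis 5+'s redis-cli --cluster subcommands; on older images the redis-trib.rb equivalents apply, and the pod names, replica count and port are illustrative):

```sh
# 1. Scale the StatefulSet to get one new, empty pod:
kubectl scale statefulset redis-cluster --replicas=7

# 2. Introduce the new pod to the cluster as an empty master
#    (any existing node works as the second argument):
NEW_IP=$(kubectl get pod redis-cluster-6 -o jsonpath='{.status.podIP}')
SEED_IP=$(kubectl get pod redis-cluster-0 -o jsonpath='{.status.podIP}')
kubectl exec redis-cluster-0 -- redis-cli --cluster add-node "$NEW_IP:6379" "$SEED_IP:6379"

# 3. Move a share of the hash slots onto the new master:
kubectl exec redis-cluster-0 -- redis-cli --cluster rebalance "$SEED_IP:6379" \
  --cluster-use-empty-masters
```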
> redis cluster management is a bit tedious
Yeah. I had not run redis in k8s before (lots of kube, lots of redis, never together), so I had never gone down the full automation path. The assumption that a human will, at least once, manually create the cluster and add/remove nodes is not exactly cloud-friendly.
> So if I understand you correctly, you want to do something like the following (let's assume a 1:1 master-slave ratio):
Actually, I would like to go a step beyond that. FYI, this is how I automate etcd setup for kube (and zookeeper for Kafka):
```
findAllNodesInMyCluster()
if (no other nodes) {
    iAmFirstMaster_Initialize()
} else {
    findOtherMasters_Join()
}
```
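Translated to Redis, a minimal discovery-based sketch could look like this (assuming a headless service named redis-cluster whose DNS lists the ready pods, that getent and hostname are available in the image, and that a local redis-server is already running in cluster mode; CLUSTER MEET is the only Redis command used):

```sh
#!/bin/sh
# Discovery-based sketch: find peers via the headless service, act as the first
# master if alone, otherwise join an existing cluster.

MY_IP="$(hostname -i)"

# findAllNodesInMyCluster(): every peer IP behind the headless service, minus me
PEERS="$(getent ahosts redis-cluster | awk '{print $1}' | sort -u | grep -Fvx "$MY_IP")"

if [ -z "$PEERS" ]; then
  # iAmFirstMaster_Initialize(): nothing to join yet; stay a lone cluster member
  # (hash-slot assignment still has to happen once enough masters exist).
  echo "no peers found, acting as first master"
else
  # findOtherMasters_Join(): gossip with any one existing node; the Redis
  # cluster bus propagates the rest of the membership.
  FIRST_PEER="$(echo "$PEERS" | head -n 1)"
  redis-cli -h "$FIRST_PEER" cluster meet "$MY_IP" 6379
fi
```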
We can weave master-vs-slave logic in here, or have a separate StatefulSet for slaves.
For the complications:
- "We need to balance the keys across masters" - well, the whole removal automation thing is non-trivial. I would be more than happy if we can start with a number of masters, or even a fixed number, as long as no human intervention is required. (
- "As I said, this assumes a specific master-slave ratio." Agreed. For now, though, I am happy with fixed ratios, even fixed numbers of masters and slaves, then can improve.
From my perspective, the biggest stumbling block to getting it fully automated is the initial cluster setup and join.
> Compared to spinning up new VMs and configuring them by hand, having to perform a handful of copy-and-paste commands whenever you need to scale up your cluster seems like a fair compromise, don't you think?
As long as I can do them via CI (kube config files) and not by logging in, sure.
I think I am going to fork your repo and make some suggestions. Good idea? You have MIT license, so I figure you don't mind?
I don't have much time to look into your comment now, sorry. But:
> I think I am going to fork your repo and make some suggestions. Good idea? You have MIT license, so I figure you don't mind?
Of course! That's why it's open source after all. I'm curious what you'll come up with 😉
Ugh, I am coming up against all sorts of limitations. See my SO question here. Basically:
- When a master goes down and comes back, unless there is an external persistent storage volume, the master will not rejoin the cluster, yet the other nodes won't know it. The cluster is broken.
- When a master goes down and comes back, unless there is an external persistent storage volume, the master comes back with no data, so the slave loses all data too.
- In a completely new startup, if A comes online before B (which can happen), A will try to start a cluster without B, and fail in the attempt.
The last one is solvable with timeouts (wait for all other nodes), but that is fragile.
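Something like this is what I mean by the timeout workaround (a sketch; the expected pod count, the timeout, and the headless-service name redis-cluster are assumptions):

```sh
#!/bin/sh
# The "wait for all other nodes" workaround -- fragile, as said: the expected
# pod count has to be known up front, and the headless service "redis-cluster"
# has to list only ready pods in DNS.

EXPECTED=6
TIMEOUT=120
elapsed=0

until [ "$(getent ahosts redis-cluster | awk '{print $1}' | sort -u | wc -l)" -ge "$EXPECTED" ]; do
  if [ "$elapsed" -ge "$TIMEOUT" ]; then
    echo "gave up waiting for peers" >&2
    exit 1
  fi
  sleep 2
  elapsed=$((elapsed + 2))
done

# Only now is it safe(ish) for redis-cluster-0 to run the one-time
# `redis-cli --cluster create ... --cluster-replicas 1` across all pods.
```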
There is a "right" solution to this, which is the Kafka model (or consul's but with sharding):
- Each node starts up in cluster mode
- Each node is told the name of at least one peer, connects to it, and joins the cluster, even an existing one
- Each node detects the existence or failure of all others and adjusts accordingly (resharding)
- Each node contains its own shard data and replicas for some other nodes
Essentially, there are only masters; they are self-adjusting, self-healing and self-joining. But Redis is built in an entirely different way. Sigh.
Right, that's kind of what I also encountered. Essentially, all of your points are the main reason I used StatefulSets in the first place. I admire your courage in trying to find an all-automated solution though 😉 .
Yeah, but even StatefulSets don't solve it. They keep the hostname consistent and, if you map external persistent volumes (I prefer not to), keep them consistently mounted, but the fundamental protocol and cluster problems remain.
Basically, redis is a great KV technology that was built pre-cloud.
@sanderploegsma Does this project support k8s 1.8+?
@KeithTt not sure why you're hijacking the thread, but yeah, it should.