spicedb-operator
Support Autoscaling
Hello,
I had an initial look at the operator, and I wonder if there is any way to support autoscaling, like https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/ ?
Right now, there's not a great way to use HPA with the operator. The operator enforces a replica count and writes it every time the config changes, so you will see HPA and the operator fight to change the replicas (not continuously, just when there is a config change to the cluster, but it's still not ideal).
We definitely intend to support autoscaling with the operator, though it may or may not involve the HPA autoscaler. Depending on what we do for https://github.com/authzed/spicedb-operator/issues/82 for example, we may be able to scale up by adding nodes and filling their cache before they start responding to traffic.
Frequent scale up / scale down is probably not ideal for performance since by default we only store 1 copy of a cached item. We could bump the cache spread up, which would require more memory but may make scaling up and down quickly less disruptive (even without cache warming). This could make a lot of sense as a way to deploy SpiceDB since it is frequently CPU bound.
If there's interest, we could do something short term, like adding a setting to keep the operator from writing replicas so that other tools (HPA) can take over.
@AyWa I'm going to rename this and keep it as a tracking issue for autoscaling support - thanks for kicking off the discussion!
Hi @ecordell, what's the status on this? For us, scaling up and down likely makes sense, even with temporarily reduced performance. To test this, it would be nice to try the workaround until we have something more sophisticated.
@tarjanik Nothing currently in-flight to support this, but ideas (and PRs!) are welcome.
The brute-force idea to expose this would be:
- have a flag that stops the operator from setting replicas, e.g. `.spec.unmanagedReplicas: true`
- if that flag is set, don't send the replicas in the `apply` API call on the deployment; this allows you to attach whatever external HPA/VPA/etc. to the deployment to control scale
But there might be some better options. I'd need to double-check how HPA is implemented; if it directly uses the `scale` APIs, then this should be possible just by enabling the scale API, with no code changes, and deploying an HPA object that points to the SpiceDBCluster.
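To make the brute-force idea concrete, here is a minimal sketch in Python (the operator itself is not written in Python; this only illustrates the decision logic). It assumes the proposed `unmanagedReplicas` flag exists, which it currently does not, and the helper `build_deployment_apply`, the plain-dict shapes, and the default of 1 replica are all hypothetical names and values chosen for illustration.

```python
from copy import deepcopy


def build_deployment_apply(cluster_spec: dict, base_deployment: dict) -> dict:
    """Build the Deployment body that would be sent in the apply call.

    `unmanagedReplicas` is the flag proposed in this thread, not an existing
    SpiceDBCluster field. The default of 1 replica is an assumption.
    """
    deployment = deepcopy(base_deployment)  # never mutate the caller's object
    spec = deployment.setdefault("spec", {})

    if cluster_spec.get("unmanagedReplicas", False):
        # Omit replicas entirely so an external scaler can control the count.
        spec.pop("replicas", None)
    else:
        # Default behavior: enforce the configured replica count.
        spec["replicas"] = cluster_spec.get("replicas", 1)
    return deployment


base = {"kind": "Deployment", "spec": {"replicas": 3, "template": {}}}
managed = build_deployment_apply({"replicas": 5}, base)
unmanaged = build_deployment_apply({"replicas": 5, "unmanagedReplicas": True}, base)

print(managed["spec"]["replicas"])      # 5
print("replicas" in unmanaged["spec"])  # False
```

With the flag set, the payload no longer carries a replica count, so a config change would not overwrite whatever an external scaler had chosen. How the API server treats a field that was previously applied and is now omitted depends on the apply mechanics, which this sketch does not model.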
Hi there,
Do you have any update on this? Thanks.