charts-clickhouse
Set pod priorities to reduce probability of ingestion being down
Is your feature request related to a problem?
Currently all pods have equal priority, so if anything goes wrong the events pod might be killed first. There are pods we care less about, e.g. beat, worker and even plugins: as long as events and Kafka are up, we can catch up on ingesting events later. If the events pod is down, however, we lose data (events).
Describe the solution you'd like
Consider adding a PodDisruptionBudget (PDB) for all deployments. The PDB should be enabled when more than one replica is configured for the component (replicascount or hpa.minpods > 1). A PDB makes the components more reliable during voluntary disruptions such as node maintenance, scaling and eviction.
See the official k8s documentation for more info.
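As a rough illustration of the request, a PDB for a single component could look like the manifest below. This is only a sketch: the resource name `events-pdb` and the `app: events` label are assumptions and would need to match whatever labels the chart actually puts on the events pods.

```yaml
# Sketch of a PodDisruptionBudget for the events deployment.
# The name and the selector labels are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: events-pdb
spec:
  # Keep at least one events pod running during voluntary disruptions
  # such as a node drain.
  minAvailable: 1
  selector:
    matchLabels:
      app: events
```

With only one replica, `minAvailable: 1` would block node drains entirely, which is why the PDB should only be enabled when more than one replica is configured.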
Describe alternatives you've considered
Do nothing. The user must create and maintain the PDB by themselves.
Additional context
I'm not sure how pod priorities work once we have multiple replicas (likely well if we do autoscaling and poorly if we don't).
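For reference, pod priorities are assigned through a PriorityClass that the pod spec then references by name. The sketch below assumes a hypothetical class name `ingestion-critical` and an arbitrary value; neither exists in the chart today.

```yaml
# Sketch of a PriorityClass for ingestion-critical pods.
# The name and the value are hypothetical.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: ingestion-critical
value: 1000000          # higher value means higher priority
globalDefault: false    # pods without a priorityClassName are unaffected
description: "Pods whose downtime causes event loss (e.g. events)."
---
# Fragment of the events Deployment showing where the class is referenced.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: events
spec:
  selector:
    matchLabels:
      app: events
  template:
    metadata:
      labels:
        app: events
    spec:
      priorityClassName: ingestion-critical
      containers:
        - name: events
          image: example/events:latest   # placeholder image
```

Less important pods (beat, worker, plugins) could either be left without a class or be given a class with a lower value.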
The PDB can be enabled automatically when there is more than one replica, or explicitly via a configuration option.
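In the chart, that condition could be expressed roughly as in the template below. All value keys (`events.replicacount`, `events.hpa.minpods`, `events.podDisruptionBudget.enabled`) are assumptions for illustration and are expected to have defaults in values.yaml; the real key names in the chart may differ.

```yaml
{{- /* Hypothetical template, e.g. templates/events-pdb.yaml. */ -}}
{{- /* Assumes all referenced value keys are defined in values.yaml. */ -}}
{{- $replicas := int .Values.events.replicacount }}
{{- $minPods := int .Values.events.hpa.minpods }}
{{- if or .Values.events.podDisruptionBudget.enabled (gt $replicas 1) (gt $minPods 1) }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: {{ .Release.Name }}-events
spec:
  # Allow at most one events pod to be evicted at a time.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: events
{{- end }}
```

Using `maxUnavailable` rather than `minAvailable` here keeps the budget meaningful as the replica count changes, but either field would work.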
Related to #176