charts-clickhouse

Set pod priorities to reduce probability of ingestion being down

Open • tiina303 opened this issue 4 years ago • 1 comment

Is your feature request related to a problem?

Currently all pods are equal, so if anything goes wrong the events pod might be the first to get killed. There are pods we care less about, e.g. beat, worker, and even plugins: as long as events and kafka are up, we can catch up on ingesting events later. If the events pod is down, however, we lose data (events).
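To make the idea concrete, here is a rough sketch of how two priority tiers could be expressed with the Kubernetes `PriorityClass` resource (`scheduling.k8s.io/v1`) and the pod-level `priorityClassName` field. The class names, the numeric values, and the helper functions are hypothetical and only illustrate the shape; the chart does not define any of them today.

```python
import json

# Hypothetical tiers: ingestion must stay up, the rest can catch up later.
PRIORITY_TIERS = {
    "ingestion-critical": 1000,  # e.g. the events pod (assumed value)
    "catch-up-later": 100,       # e.g. beat, worker, plugins (assumed value)
}


def build_priority_class(name: str, value: int, description: str) -> dict:
    """Return a PriorityClass manifest as a plain dict."""
    return {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": name},
        "value": value,
        "globalDefault": False,
        "description": description,
    }


def with_priority(pod_spec: dict, priority_class_name: str) -> dict:
    """Return a copy of a pod spec that references the given PriorityClass."""
    return {**pod_spec, "priorityClassName": priority_class_name}


if __name__ == "__main__":
    for tier, value in PRIORITY_TIERS.items():
        manifest = build_priority_class(tier, value, f"Hypothetical tier: {tier}")
        print(json.dumps(manifest, indent=2))
```

A higher `value` means a higher priority, so pods in the lower tier would be candidates for preemption before the events pod.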

Describe the solution you'd like

Consider adding a PodDisruptionBudget (PDB) for all deployments. The PDB should be enabled if more than one replica is configured for the component (replicascount or hpa.minpods > 1). A PDB makes the components more reliable during node maintenance, scaling, and eviction.
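As a minimal sketch of the rule above, the following shows the enable condition and the resulting `policy/v1` PodDisruptionBudget manifest. The function names, the `-pdb` name suffix, the example labels, and the default of `maxUnavailable: 1` are assumptions for illustration, not something the chart defines.

```python
import json
from typing import Optional


def pdb_enabled(replica_count: int, hpa_min_pods: Optional[int] = None) -> bool:
    """Enable the PDB when the replica count or the HPA minimum is above 1."""
    return replica_count > 1 or (hpa_min_pods or 0) > 1


def build_pdb(component: str, match_labels: dict, max_unavailable: int = 1) -> dict:
    """Return a PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{component}-pdb"},  # hypothetical naming scheme
        "spec": {
            "maxUnavailable": max_unavailable,  # assumed default
            "selector": {"matchLabels": match_labels},
        },
    }


if __name__ == "__main__":
    # Example values only; the label key and value are assumptions.
    if pdb_enabled(replica_count=1, hpa_min_pods=2):
        print(json.dumps(build_pdb("events", {"app": "events"}), indent=2))
```

With a single replica and no HPA the condition is false and no PDB is created, which avoids a budget that would block node drains entirely.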

See the official k8s documentation for more info.

Describe alternatives you've considered

Do nothing. The user must create and maintain the PDB by themselves.

Additional context

I'm not sure how pod priorities work once we have multiple replicas (likely well if we do autoscaling and poorly if we don't).

The PDB can be enabled either when more than one replica is configured or through a configuration option.

tiina303 • Oct 14 '21 20:10

Related to #176

guidoiaquinti • Dec 22 '21 10:12