clickhouse-docs Update scaling public docs to expose behavior and thresholds

Open aashishkohli opened this issue 1 year ago • 0 comments

The public scaling docs currently don't fully explain autoscaling behavior, in particular the thresholds we use to scale based on CPU and memory usage. We should update the docs to include the following details:

CPU-based autoscaling: We scale up (double the CPU allocation) when CPU usage crosses an upper threshold in the range of 50-75% (the actual threshold depends on the size of the cluster). When CPU usage falls below the lower threshold, which is half of the upper threshold (for example, 25% when the upper threshold is 50%), we recommend downscaling the service and halving the CPU allocation.
Memory-based autoscaling: We recommend scaling memory to 125% of the maximum memory usage, or up to 150% if we encounter OOM (out-of-memory) errors.
Lookback window: We look at usage data from the past 30 hours to make scaling decisions.
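
For whoever writes the docs update, the rules above could be illustrated with something like the sketch below. This is only an illustration of the stated thresholds, not the actual autoscaler: the function names (`peak_in_window`, `recommend_cpu`, `recommend_memory`), the default 50% upper threshold, and the choice to apply the full 150% factor after an OOM (the text says "up to 150%") are all assumptions. How CPU usage is aggregated over the lookback window is not specified here, so the sketch takes it as an input.

```python
from datetime import datetime, timedelta

# Lookback window from the issue text: 30 hours.
LOOKBACK = timedelta(hours=30)


def peak_in_window(samples, now):
    """Return the maximum value among (timestamp, value) samples
    taken within the last 30 hours, or None if there are none."""
    recent = [value for ts, value in samples if now - ts <= LOOKBACK]
    return max(recent) if recent else None


def recommend_cpu(current_cpu, cpu_usage_pct, upper_threshold_pct=50.0):
    """Hypothetical CPU rule: double above the upper threshold,
    halve below half of it, otherwise keep the allocation.
    The real upper threshold is 50-75% depending on cluster size."""
    if cpu_usage_pct > upper_threshold_pct:
        return current_cpu * 2
    if cpu_usage_pct < upper_threshold_pct / 2:
        return current_cpu / 2
    return current_cpu


def recommend_memory(max_memory_used_gib, saw_oom=False):
    """Hypothetical memory rule: 125% of peak usage, or 150% after an OOM.
    Assumption: the sketch always uses the full 150% upper bound."""
    factor = 1.5 if saw_oom else 1.25
    return max_memory_used_gib * factor


if __name__ == "__main__":
    now = datetime(2024, 9, 19, 20, 0)
    memory_samples = [
        (now - timedelta(hours=40), 30.0),  # outside the window, ignored
        (now - timedelta(hours=10), 16.0),
        (now - timedelta(hours=1), 12.0),
    ]
    peak = peak_in_window(memory_samples, now)
    print(recommend_memory(peak))
    print(recommend_cpu(current_cpu=8, cpu_usage_pct=60.0))
```

Keeping the CPU and memory rules as separate functions mirrors the way they are described above; the issue does not say how the two recommendations are combined into a single scaling decision.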

NOTE: We are also working on improving these thresholds and the scale-down windows; those changes will come with MBB and other work in progress.

Sep 19 '24 20:09 aashishkohli
