indok-web
Explore auto scaling ECS based on events
To automatically handle peak loads. References:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/autoscaling.html
https://docs.aws.amazon.com/autoscaling/ec2/userguide/schedule_time.html
Proposed changes
Create a model that serves as a guideline for how we should scale our ECS clusters depending on a set of variables. For example, for events:
scaling_variable = "available_slots"
scaling_factor = 0.2 # spin up a task for every 0.2 of scaling_variable
scheduled_time = "signup_open_date"
Then, using this model, we use boto3 to schedule tasks according to the above variables for the scheduled_time. In doing so, we automatically prepare our servers for estimated peak load and improve the overall user experience. Furthermore, by having these numbers set dynamically, we can alter them without redeployment if we observe that they are either too lenient or too aggressive.
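A minimal sketch of how the model could be turned into a scheduled scaling action, under several assumptions that are not in the proposal itself. It reads scaling_factor as "tasks per unit of scaling_variable" (so 100 available slots at 0.2 gives 20 tasks), it scales up a hypothetical lead_time before scheduled_time, and it targets Application Auto Scaling (the boto3 "application-autoscaling" client, which handles ECS service desired counts) rather than the EC2 Auto Scaling client linked above. The function names, cluster name, and service name are placeholders.

```python
import math
from datetime import datetime, timedelta, timezone


def desired_task_count(scaling_value: int, scaling_factor: float, minimum: int = 1) -> int:
    """Assumed reading of the model: tasks = ceil(scaling_factor * scaling_value)."""
    # round() guards against floating-point noise before taking the ceiling.
    return max(minimum, math.ceil(round(scaling_factor * scaling_value, 9)))


def build_scheduled_action(
    cluster: str,
    service: str,
    event_name: str,
    available_slots: int,
    scaling_factor: float,
    signup_open_date: datetime,
    lead_time: timedelta = timedelta(minutes=10),  # hypothetical warm-up margin
) -> dict:
    """Build keyword arguments for put_scheduled_action on an ECS service."""
    if signup_open_date.tzinfo is None:
        raise ValueError("signup_open_date must be timezone-aware")
    # at(...) expressions are interpreted as UTC unless a Timezone is supplied.
    start = (signup_open_date - lead_time).astimezone(timezone.utc)
    tasks = desired_task_count(available_slots, scaling_factor)
    return {
        "ServiceNamespace": "ecs",
        "ScheduledActionName": f"scale-up-{event_name}",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "Schedule": start.strftime("at(%Y-%m-%dT%H:%M:%S)"),
        # Setting min and max to the same value pins the task count;
        # a second action would be needed to scale back down afterwards.
        "ScalableTargetAction": {"MinCapacity": tasks, "MaxCapacity": tasks},
    }


# Usage (needs AWS credentials and a scalable target already registered
# for the service via register_scalable_target):
#   import boto3
#   client = boto3.client("application-autoscaling")
#   client.put_scheduled_action(**build_scheduled_action(...))
```

Keeping the request construction separate from the AWS call means the model can be checked without credentials, and the factor can be read from the database at scheduling time so it can be changed without a redeploy.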
Metrics
[Screenshot: Screen Shot 2022-02-17 at 13 13 48]
[Screenshot: Screen Shot 2022-02-17 at 13 13 28]
The peak at 11:00 is Winter Games: roughly 100 simultaneous sign-ups, starting at 8 backend tasks. This led to significant delays, a poor user experience, and roughly 100% of new connections being dropped at peak load.
The peak at 12:00 is an anniversary event: roughly 100 simultaneous sign-ups, starting at 32 backend tasks. The user experience was much better, and no dropped connections were registered.
DB burst scaling does not appear to be a bottleneck.
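To make the comparison concrete, the two observations can be expressed as sign-ups per backend task. This is only a rough sketch: it uses the approximate figures quoted above, the helper name is hypothetical, and two data points are not enough to establish a capacity threshold.

```python
def signups_per_task(simultaneous_signups: int, backend_tasks: int) -> float:
    """Average number of simultaneous sign-ups each backend task had to absorb."""
    return simultaneous_signups / backend_tasks


winter_games = signups_per_task(100, 8)   # 12.5 sign-ups per task, connections dropped
anniversary = signups_per_task(100, 32)   # 3.125 sign-ups per task, no drops registered
```

A scaling factor for the model would presumably need to keep the per-task load closer to the second figure than to the first.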