Add adaptive idle runners

Open Fabiosilvero opened this issue 11 months ago • 0 comments

As discussed in the linked discussion, here's what the feature would look like:

  • A set of new jobs comes in
  • GARM reacts and creates runners
  • Jobs finish running and signal the runners can be removed
  • GARM removes the runners
  • Another set of jobs comes in
  • GARM decides that there is a need to spin up idle runners (if adaptive runners are enabled) because a trend of activity has been established
  • GARM calculates a min-idle-runner setting based on trends over the past, say, X minutes/hours, or uses a default min-idle-runner value set by the operator as a base for faster reaction times and adapts it later on based on traffic
  • GARM spins up a number of idle runners based on this new setting
  • GARM continuously adjusts the value based on activity
  • The entity goes idle after a while and no new jobs are triggered (office hours are over)
  • GARM starts an idle timeout. Once it is reached, runners are scaled down to 0 until activity picks up again
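
To make the flow above concrete, here is a minimal sketch of how the adaptive value could be computed. It is not GARM code and does not reflect any existing GARM API: the class `AdaptiveIdleScaler`, its fields (`window_s`, `idle_timeout_s`, `spin_up_s`, `trend_threshold`, `base_min_idle`, `max_idle`) and the rate-times-spin-up heuristic are all assumptions made for illustration.

```python
import math
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AdaptiveIdleScaler:
    """Hypothetical sketch: derive a min-idle-runner value from recent job arrivals."""

    window_s: float = 3600.0        # look-back window (the "X minutes/hours")
    idle_timeout_s: float = 1800.0  # scale to 0 after this long without a new job
    spin_up_s: float = 300.0        # assumed time the provider needs to boot a runner
    trend_threshold: int = 2        # arrivals needed before a "trend" is assumed
    base_min_idle: int = 1          # operator-set default used as a floor
    max_idle: int = 10              # safety cap to bound cost
    _arrivals: deque = field(default_factory=deque, init=False, repr=False)

    def record_job(self, now: float) -> None:
        """Record the arrival time (in seconds) of a newly queued job."""
        self._arrivals.append(now)

    def min_idle_runners(self, now: float) -> int:
        """Return how many idle runners to keep warm at time `now`."""
        # Forget arrivals that fell out of the look-back window.
        while self._arrivals and self._arrivals[0] <= now - self.window_s:
            self._arrivals.popleft()
        if not self._arrivals:
            return 0
        # Idle timeout reached: scale down to 0 until activity picks up again.
        if now - self._arrivals[-1] >= self.idle_timeout_s:
            return 0
        # No trend yet: stay purely reactive, as with the first set of jobs.
        if len(self._arrivals) < self.trend_threshold:
            return 0
        # Jobs expected to arrive while one runner is still spinning up.
        rate = len(self._arrivals) / self.window_s
        estimate = math.ceil(rate * self.spin_up_s)
        return min(self.max_idle, max(self.base_min_idle, estimate))
```

In this sketch the operator default acts as a floor once a trend exists, the estimate grows with the arrival rate, and the idle timeout overrides everything else. A real implementation would likely want smoothing and per-pool limits, which are left out here.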

This will allow for eager spin-up of runners in times of activity, reducing wait times for clouds that have really slow spin-up times.

Originally posted by @gabriel-samfira in https://github.com/cloudbase/garm/discussions/336#discussioncomment-12052776

Idle runners are great for providers that don't incur extra costs for runners just sitting around (like Incus and LXD). On public clouds, however, idle runners cost money. An adaptive version would reduce that cost.

Fabiosilvero, Feb 04 '25 09:02