
[Proposal] reducing the environmental impact of our infrastructure using prediction algorithms

Open ofrcblas opened this issue 1 year ago • 5 comments

Description

I invite you to join us on a project aimed at reducing the environmental impact of our infrastructure. Today's Kubernetes ecosystem already offers features for resizing the pods in a cluster (HPA, VPA). However, there appears to be no feature specifically designed to power the underlying bare-metal servers off when they are not in use and back on when they are needed. Here's the idea in more detail:

  1. Analyzing the global Cluster Traffic: Understanding the traffic patterns of the whole cluster is essential for efficient resource management, so this is the first step. To achieve this, we can train machine learning algorithms on past load and use them to predict load trends.
  2. Anticipating Future Needs: Based on the predicted load, the powering off and powering on of worker servers in the Kubernetes cluster can be scheduled proactively. By consolidating the load onto fewer worker servers before powering the others off, the scheduler can keep resource utilization efficient. Energy consumption algorithms can also be employed to tune the cluster according to the characteristics of the different servers.
  3. Ensuring Cluster Health: It is crucial to continuously monitor the overall health of our cluster. If any abnormalities or issues are detected, appropriate actions can be initiated to restore the cluster to a healthy state.
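
The first two steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual algorithm: it replaces the machine learning model with a simple seasonal-naive forecast, and every name (`forecast_load`, `nodes_needed`, `power_plan`), the load and capacity units, and the parameters `target_utilization` and `boot_lead_steps` are assumptions made for the example.

```python
import math
from typing import List


def forecast_load(history: List[float], season: int, horizon: int) -> List[float]:
    """Seasonal-naive forecast: repeat the most recent full season.

    A stand-in for a trained model; assumes the load repeats every `season` steps.
    """
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


def nodes_needed(load: float, node_capacity: float,
                 target_utilization: float = 0.8, min_nodes: int = 1) -> int:
    """Smallest number of powered-on workers that keeps utilization under target."""
    usable = node_capacity * target_utilization
    return max(min_nodes, math.ceil(load / usable))


def power_plan(forecast: List[float], node_capacity: float,
               boot_lead_steps: int = 1) -> List[int]:
    """Workers that should be powered on at each step.

    Each step looks `boot_lead_steps` ahead, so a server is already booted
    when the predicted load arrives (hypothetical boot time, in steps).
    """
    plan = []
    for t in range(len(forecast)):
        window = forecast[t:t + boot_lead_steps + 1]
        plan.append(nodes_needed(max(window), node_capacity))
    return plan


# Made-up load history (requests/s), two identical "days" of four steps each.
history = [100.0, 400.0, 900.0, 300.0] * 2
forecast = forecast_load(history, season=4, horizon=4)
plan = power_plan(forecast, node_capacity=250.0, boot_lead_steps=1)
print(plan)
```

Servers above the planned count at a given step are candidates for draining and powering off. The look-ahead window is what separates this from a purely reactive autoscaler: capacity is raised one step before the predicted peak instead of after it.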


...

Impact

By implementing these steps, we can enhance the efficiency of our infrastructure by dynamically managing server resources and reducing unnecessary power consumption. This would contribute to minimizing the carbon footprint associated with our operations.

...

Scope

We invite you to join us in this project. Together, we can create a greener future by leveraging the power of Kubernetes and innovative technologies. Whether you're a data scientist, a developer, an infrastructure expert, or simply someone passionate about environmental stewardship, your skills and ideas are invaluable.

...


ofrcblas avatar Jun 26 '23 12:06 ofrcblas

Great initiative! But perhaps the scope should be refined a bit better. The major cloud providers already offer this:

  • AWS: https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html
  • Azure: https://learn.microsoft.com/en-us/azure/aks/cluster-autoscaler
  • GCP: https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler

I've worked with those in Azure and AWS, and they work quite well. Keep me posted if you need some infrastructure support, happy to help out.

tkennes avatar Jun 30 '23 06:06 tkennes

Hello, thanks for your feedback, for sharing the links :) and for offering your support! Actually, the idea is not only to autoscale pods; it is to power off the bare-metal servers that host the Kubernetes workers. I've read the Cluster Autoscaler documentation, and it makes its decision when it is "too late"; this project proposes to make predictions and resize the cluster before the issue happens. Also, I think this project is aimed at internal Kubernetes clusters, but it could be applied to other kinds of clusters. I'll post the status of the project regularly. For now, I'm working on the machine learning stage :)

ofrcblas avatar Jun 30 '23 16:06 ofrcblas

Hello @tkennes, we've found a nice algorithm that works well with our cluster; if you want, we can give a presentation about our work. Now we have to set up the test lab in order to take measurements when the power on/off actions are performed. We've also started developing the other bricks of the system (decision, scheduler?, control). Concerning the test lab, we think it is better to have bare-metal servers in order to measure how much energy we save when we perform start-up/shutdown operations. Do you want to help us in this phase?
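
One way such a measurement could be evaluated, as a rough sketch: integrate the sampled power draw over time for an always-on run and for a power-cycled run, then compare the two. The function name `energy_wh`, the sample format, and all numbers below are made up for illustration; they are not measurements from the test lab.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, power draw in watts)


def energy_wh(samples: List[Sample]) -> float:
    """Energy in watt-hours from time-sorted power samples (trapezoidal rule)."""
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += (p0 + p1) / 2.0 * (t1 - t0)
    return joules / 3600.0


# Illustrative values only: one idle server over one hour.
always_on = [(0.0, 200.0), (3600.0, 200.0)]
# Same hour, powered off after 10 minutes and booted again for the last
# 10 minutes, with a higher draw during boot.
cycled = [(0.0, 200.0), (600.0, 0.0), (3000.0, 0.0), (3600.0, 250.0)]

saved = energy_wh(always_on) - energy_wh(cycled)
print(f"energy saved: {saved} Wh")
```

Including the shutdown and boot phases in the cycled trace matters, because the extra draw during boot reduces the net saving for short power-off windows.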

ofrcblas avatar Aug 30 '23 15:08 ofrcblas

@ofrcblas can you show up at the next TAG meeting and go over this?

catblade avatar Sep 04 '24 15:09 catblade

Hello @catblade, I'm sorry for the delay in my response. When is the next meeting? I'm busy this month, but I'll have more time next month.

chrystianblas avatar Sep 09 '24 08:09 chrystianblas