
Track energy consumption and carbon emissions

Open · weiji14 opened this issue 1 year ago · 1 comment

Training foundation models can use a lot of energy and emit significant amounts of carbon, so we should be transparent about this, especially since the Clay Foundation Model has an environmental focus.

This is mostly because we depend on our cloud provider to report accurate numbers for energy used and carbon emitted, and I don't know yet to what degree these figures are informational or audited:

  • Is the amount of energy expended in building the model disclosed?
  • Is the amount of carbon emitted (associated with the energy used) in building the model disclosed?

Originally posted by @brunosan in https://github.com/Clay-foundation/model/issues/64#issuecomment-1837442185

Implementation

Tracking tools

There are several tools for tracking energy usage and carbon emissions, such as (a minimal usage sketch follows the list below):

  • https://github.com/mlco2/codecarbon
  • https://github.com/fvaleye/tracarbon
  • others?
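
As a rough sketch of how this could hook into the training code, here is a minimal example using CodeCarbon's `EmissionsTracker`. The project name, output directory, and the placeholder training function are illustrative assumptions, not the actual Clay training entrypoint:

```python
# Minimal sketch: wrap the training loop with codecarbon's EmissionsTracker
# (pip install codecarbon). train_one_epoch() is a placeholder function.
from codecarbon import EmissionsTracker


def train_one_epoch():
    ...  # placeholder for the real training step


tracker = EmissionsTracker(project_name="clay-foundation-model", output_dir="logs")
tracker.start()
try:
    train_one_epoch()
finally:
    # stop() returns the estimated emissions (kg CO2eq) for the tracked block
    emissions_kg = tracker.stop()
    print(f"Estimated emissions: {emissions_kg:.3f} kg CO2eq")
```

The tracker also writes an `emissions.csv` report to the output directory, which could be published alongside the model card for transparency.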

Selecting environmentally friendly cloud regions

Currently (as of Dec 2023), we are running our compute on AWS's us-east-1 (North Virginia) region, which has a relatively poor carbon intensity of 378 gCO₂eq/kWh (averaged over the past year, see https://app.electricitymaps.com/zone/US-MIDA-PJM). We could consider other cloud regions that have a lower carbon intensity:

  • https://aws.amazon.com/blogs/architecture/how-to-select-a-region-for-your-workload-based-on-sustainability-goals
  • https://azure.microsoft.com/en-gb/explore/global-infrastructure/sustainability
  • https://cloud.google.com/sustainability/region-carbon

While some of these cloud providers use carbon offsets, we can also make an active decision to run compute on regions with low carbon intensity.
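
To put numbers on this, here is a back-of-the-envelope comparison of the same training run in two regions. The 378 gCO₂eq/kWh figure for us-east-1 comes from the Electricity Maps link above; the total energy figure and the low-carbon region's intensity are purely illustrative assumptions:

```python
# Back-of-the-envelope: emissions = energy used x grid carbon intensity.
# The 2,000 kWh training-energy figure and the 50 gCO2eq/kWh value for a
# "low-carbon region" are illustrative assumptions only.
TRAINING_ENERGY_KWH = 2_000

regions = {
    "us-east-1 (North Virginia)": 378,  # gCO2eq/kWh, from Electricity Maps
    "hypothetical low-carbon region": 50,  # gCO2eq/kWh, assumed
}

for region, intensity_g_per_kwh in regions.items():
    emissions_kg = TRAINING_ENERGY_KWH * intensity_g_per_kwh / 1_000
    print(f"{region}: ~{emissions_kg:,.0f} kg CO2eq")
```

With these assumed numbers, the same compute would emit roughly 756 kg CO₂eq in us-east-1 versus about 100 kg CO₂eq in the lower-carbon region, which is why region choice matters beyond offsets.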

Downstream use-cases

Training the initial Foundation Model is only the first part. We can also work to make fine-tuning the model for downstream tasks more energy efficient.

There are several ways to do this, but I'll just point out Parameter-Efficient Fine-Tuning (PEFT) methods, and also the work done by the MIT HAN Lab on efficient AI computing.
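
As a sketch of what parameter-efficient fine-tuning could look like, here is an example using the Hugging Face `peft` library's LoRA adapters on a placeholder ViT backbone. The backbone, the target module names, and the hyperparameters are assumptions for illustration, not the actual Clay model architecture:

```python
# Sketch: add LoRA adapters to a (placeholder) Vision Transformer so that only
# a small fraction of parameters is updated during fine-tuning.
# Module name "qkv" matches the attention projection layers in timm's ViT;
# all hyperparameters here are illustrative assumptions.
import timm
from peft import LoraConfig, get_peft_model

base_model = timm.create_model("vit_base_patch16_224", pretrained=False)

lora_config = LoraConfig(
    r=8,                     # low-rank adapter dimension
    lora_alpha=16,
    target_modules=["qkv"],  # attention qkv projections in the timm ViT
    lora_dropout=0.05,
)
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # typically well under 1% of all params
```

Because only the small adapter weights receive gradients and optimizer state, each downstream fine-tuning run needs less memory and compute, and therefore less energy, than full fine-tuning.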

Further reading

  • Fu, A., Hosseini, M. S., & Plataniotis, K. N. (2021). Reconsidering CO2 emissions from Computer Vision. arXiv:2104.08702 [Cs]. http://arxiv.org/abs/2104.08702
  • Lottick, K., Susai, S., Friedler, S. A., & Wilson, J. P. (2019). Energy Usage Reports: Environmental awareness as part of algorithmic accountability. arXiv:1911.08354 [Cs, Stat]. http://arxiv.org/abs/1911.08354
  • Patterson, D., Gonzalez, J., Le, Q., Liang, C., Munguia, L.-M., Rothchild, D., So, D., Texier, M., & Dean, J. (2021). Carbon Emissions and Large Neural Network Training. https://doi.org/10.48550/ARXIV.2104.10350
  • Wilkinson, R., Mleczko, M. M., Brewin, R. J. W., Gaston, K. J., Mueller, M., Shutler, J. D., Yan, X., & Anderson, K. (2024). Environmental impacts of earth observation data in the constellation and cloud computing era. Science of The Total Environment, 909, 168584. https://doi.org/10.1016/j.scitotenv.2023.168584

weiji14 · Dec 03 '23, 21:12