Enhancement to ERA5 Data Retrieval and Download Process

Open yndevops2 opened this issue 1 year ago • 2 comments

This update optimizes the retrieval and caching of ERA5 data from the Climate Data Store (CDS). Key changes include:

Caching Mechanism: Added a caching mechanism to prevent repeated downloads for identical data requests. The cache files are named based on a unique hash of the request parameters, making subsequent retrievals faster by using pre-downloaded data.
Custom Download Function: Integrated a custom download function with a progress bar to enhance user experience. The function uses chunked downloading with error handling and retry mechanisms for a robust download process.
Progress Bar: A dynamic progress bar displays the download status of multiple files, with completed files removed from the display to improve readability.
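
For illustration, the hash-based cache naming described above could look roughly like the following. This is a minimal sketch and not the code from this PR: the helper name `cache_path`, the `era5_cache` directory, the `.nc` suffix, and the choice of SHA-256 over a JSON serialisation are all assumptions.

```python
import hashlib
import json
from pathlib import Path


def cache_path(request: dict, cache_dir: str = "era5_cache") -> Path:
    """Return a deterministic cache file path for a request (hypothetical helper)."""
    # Serialise with sorted keys so that dictionary key order does not
    # change the hash of an otherwise identical request.
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"era5_{digest}.nc"


# Two requests with the same parameters map to the same cache file,
# regardless of the order in which the keys were written.
a = cache_path({"variable": "2m_temperature", "year": "2020"})
b = cache_path({"year": "2020", "variable": "2m_temperature"})
c = cache_path({"variable": "2m_temperature", "year": "2021"})
```

Note that in this sketch the order of list-valued parameters still affects the hash, so a real implementation might want to normalise those before hashing.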

These improvements aim to make data retrieval more efficient and user-friendly.
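
As a rough sketch of the chunked download with retries mentioned above, using only the standard library: the function name `download_with_retry`, its parameters, the `.part` temporary file, and the linear backoff are assumptions for illustration, and the progress bar is reduced to an optional callback. The PR's actual implementation may differ.

```python
import time
import urllib.request
from pathlib import Path


def download_with_retry(url, target, chunk_size=1024 * 1024,
                        retries=3, backoff=1.0, on_progress=None):
    """Stream `url` to `target` in chunks, retrying on I/O errors (hypothetical helper)."""
    target = Path(target)
    # Write to a temporary ".part" file so an interrupted download
    # never leaves a truncated file under the final name.
    partial = target.with_name(target.name + ".part")
    for attempt in range(1, retries + 1):
        try:
            done = 0
            with urllib.request.urlopen(url, timeout=60) as response, \
                    open(partial, "wb") as out:
                while True:
                    chunk = response.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)
                    done += len(chunk)
                    if on_progress is not None:
                        # A progress bar could be updated here.
                        on_progress(done)
            partial.replace(target)
            return target
        except OSError:
            # URLError, HTTPError and socket timeouts are OSError subclasses.
            if attempt == retries:
                raise
            time.sleep(backoff * attempt)
```

Each retry in this sketch restarts the file from the beginning; resuming a partial download would additionally require HTTP range requests.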

Closes # (if applicable).

Changes proposed in this Pull Request

Checklist

[x] Code changes are sufficiently documented; i.e. new functions contain docstrings and further explanations may be given in doc.
[ ] Newly introduced dependencies are added to environment.yaml, environment_docs.yaml and setup.py (if applicable).
[ ] A note for the release notes doc/release_notes.rst of the upcoming release is included.
[ ] Unit tests for new features were added (if applicable).
[ ] I consent to the release of this PR's code under the MIT license.

Oct 28 '24 16:10 yndevops2

Hi @lkstrp, we asked @yndevops2 to help us speed up the download because, with the main-branch version of Atlite, it was not possible to download the global, grid-scale, multi-year capacity-factor time series that we needed for a project. With this upgrade, that is now possible.

Caching is optional and controlled by a flag, so we can discuss whether you'd be interested in integrating only the sped-up download. The idea behind the caching was that when you change the region of interest to something smaller than what was downloaded before, you can avoid redownloading the data.

Jan 28 '25 01:01 awongel