
Add heuristic for ERA5 download chunk sizes

Open euronion opened this issue 3 years ago • 3 comments

  • [ ] Think about a heuristic to download in smaller or larger chunks depending on the geographical scope of the data to be downloaded

ERA5 cutouts are currently downloaded as yearly time slices (after #236, as monthly time slices) to avoid requesting overly large data pieces from the ERA5 backend. Monthly retrieval could, in theory, slow down cutout preparation. We could employ a heuristic that checks the request size and then decides, based on that size, whether to use monthly or yearly retrieval.

See discussion here: https://github.com/PyPSA/atlite/issues/236#issuecomment-1185474539

euronion avatar Sep 06 '22 12:09 euronion

@euronion Stumbled upon the same issue and adapted the timeframe to optimize for my use case (small cutout but long timeframe). I added a heuristic that makes the requests as large as possible while staying within the 120,000-field limit. However, I don't know how to account for the size limit with cutouts covering large areas. If someone could help me with this information, I might be able to implement this feature.

johhartm avatar Feb 21 '24 09:02 johhartm
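A minimal sketch of the kind of heuristic described above, assuming (as reported in this thread, not from official CDS documentation) a 120,000-field limit per request and that fields scale as hourly time steps times variables. The function name and the month cap are hypothetical, not part of atlite:

```python
# Hypothetical sketch (not atlite's actual code): pick the largest chunk of
# months per CDS request such that the estimated number of fields
# (hourly time steps * variables) stays below the 120,000-field limit
# reported in this thread.

FIELD_LIMIT = 120_000   # assumed maximum number of fields per CDS request
HOURS_PER_MONTH = 744   # upper bound: 31 days * 24 hourly time steps


def max_months_per_request(n_variables: int,
                           field_limit: int = FIELD_LIMIT) -> int:
    """Largest number of months that fits in one request, capped at a year."""
    fields_per_month = HOURS_PER_MONTH * n_variables
    return max(1, min(12, field_limit // fields_per_month))
```

For example, with 13 variables this yields 12 months, so a yearly retrieval would still fit; with 20 variables it drops to 8 months per request.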

Hi @johhartm, thanks for the initiative. I would assume that estimating the number of fields via

`(latitude range / resolution) * (longitude range / resolution) * number of time steps * number of variables in the request`

should be good for a heuristic.
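The formula above could be sketched as follows. This is a hedged illustration, not atlite code: the 0.25° grid spacing is ERA5's nominal resolution, and whether the spatial extent actually counts toward the field limit is still an open question in this thread.

```python
# Illustrative field-count estimate for one CDS request, following the
# formula suggested above. The function name and signature are hypothetical.

def estimate_fields(lat_range: float, lon_range: float,
                    n_time_steps: int, n_variables: int,
                    resolution: float = 0.25) -> int:
    """Estimate the number of fields in a request for a lat/lon box
    (ranges in degrees), assuming a regular grid with the given spacing."""
    n_lat = int(lat_range / resolution) + 1  # grid points along latitude
    n_lon = int(lon_range / resolution) + 1  # grid points along longitude
    return n_lat * n_lon * n_time_steps * n_variables
```

The result could then be compared against the assumed limit to decide between monthly and yearly retrieval.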

Where did you get the 120,000-field number from? This is the first time I've heard a concrete number, and it seems a bit small, but that might depend on the definition of what a "field" is.

euronion avatar Feb 25 '24 22:02 euronion

I got this number from experimenting with larger requests and having them fail with an error message saying the request was too large and the maximum request size is 120,000 fields. For me, the heuristic number of time steps * variables within the request worked, but only downloaded data for a fairly limited spatial frame. However, I am starting to think that the spatial extent does not affect the "field size", although it might still have to be taken into account to prevent the file size per request from getting too large. I will test this hypothesis with some larger cutouts and get back when I have results.

johhartm avatar Feb 26 '24 10:02 johhartm