
ReadGridded should support cropping geographically

Open thorbjoernl opened this issue 1 year ago • 4 comments

Is your feature request related to a problem? Please describe. Reading the entire gridded data is quite memory-intensive, so discarding unneeded data (or, ideally, not reading it in the first place) would be useful.

Describe the solution you would like to see

  • Ideally, read calls would accept a bounding box, and data outside this bounding box would not be read.


thorbjoernl avatar Oct 09 '24 14:10 thorbjoernl

Note: this will not work out of the box for non-lon-lat grids, so this PR will have to be limited in scope to what xarray offers (e.g., .sel on longitude and latitude).
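For illustration, here is a minimal sketch of what cropping with xarray's `.sel` could look like on a regular lon-lat grid. The helper `crop_lonlat` and the coordinate names `lon` and `lat` are assumptions made for this example. They are not part of pyaerocom's API, and real data may use other coordinate names or a grid where this approach does not apply.

```python
import numpy as np
import xarray as xr


def crop_lonlat(data, lon_range, lat_range, lon_name="lon", lat_name="lat"):
    """Hypothetical helper: crop a DataArray/Dataset to a lon-lat bounding box.

    Only valid for regular grids with 1D, monotonic lon/lat coordinates.
    """

    def _as_slice(coord_name, bounds):
        lo, hi = min(bounds), max(bounds)
        values = data[coord_name].values
        # Label-based slices must follow the coordinate's sort order,
        # so flip the bounds if the coordinate is stored descending.
        if values[0] > values[-1]:
            return slice(hi, lo)
        return slice(lo, hi)

    return data.sel(
        {
            lon_name: _as_slice(lon_name, lon_range),
            lat_name: _as_slice(lat_name, lat_range),
        }
    )


# Small synthetic field on a 10-degree grid (stand-in for real model output).
lat = np.arange(-90, 91, 10.0)
lon = np.arange(-180, 180, 10.0)
field = xr.DataArray(
    np.zeros((lat.size, lon.size)),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

cropped = crop_lonlat(field, lon_range=(0, 30), lat_range=(40, 70))
print(cropped.shape)
```

Label-based slicing with `.sel` includes both end points. A bounding box that crosses the antimeridian would need separate handling, which this sketch does not attempt.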

lewisblake avatar Oct 14 '24 09:10 lewisblake

Another note: We want this to be implemented in such a way that it does not cause the data to be realized in memory earlier than the current implementation.

lewisblake avatar Oct 21 '24 09:10 lewisblake

@thorbjoernl Please specify the setup you are using and where you think extensive memory is used (in the log-files).

E.g. for collocation, we never read the complete gridded data, just a few time slices of it, and then keep only the lat/lon variables. Memory usage and time usage are tuned to work well here.

heikoklein avatar Jan 08 '25 10:01 heikoklein

I originally created this issue because I ran into memory issues with some of the work I did with David (which we worked around by using the workers). This isn't critical for my purposes anymore, and if what you say is correct, it may not even be that relevant to reducing memory usage. So I'd say this issue may be closed.

That being said, I know @lewisblake also wanted this functionality, so maybe ask him as well.

thorbjoernl avatar Jan 08 '25 12:01 thorbjoernl