Make NetCDF file cache handling compatible with dask distributed

Open gerritholl opened this issue 1 month ago • 2 comments

This PR makes file cache handling in the NetCDF4FileHandler compatible with dask distributed. It adds a utility function, get_distributed_friendly_dask_array, to satpy.readers.utils. The function produces a dask.array from a netCDF4 variable that can be used in an xarray object while keeping any dask graph that includes it picklable, and thus computable with dask distributed. NetCDF4FileHandler now uses this utility function and replaces its homegrown file handle caching with caching via xarray.backends.CachingFileManager, which the utility function needs.

  • [x] Closes #2815
  • [x] Tests added
  • [x] Fully documented
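
To illustrate the picklability problem described above, here is a minimal standard-library sketch. It is an analogy only, not the code in this PR: the names EagerReader, LazyReader and is_picklable are hypothetical, and plain binary files stand in for netCDF4 datasets. An object that holds an open file handle cannot be pickled, whereas an object that stores only the path and reopens the file on demand can be, which is roughly the property a file manager provides to the tasks in a dask graph.

```python
import os
import pickle
import tempfile


class EagerReader:
    """Hypothetical reader that holds an open file handle directly."""

    def __init__(self, path):
        self.fh = open(path, "rb")

    def read(self, start, stop):
        self.fh.seek(start)
        return self.fh.read(stop - start)


class LazyReader:
    """Hypothetical reader that stores the path and opens the file on first use."""

    def __init__(self, path):
        self.path = path
        self._fh = None

    def read(self, start, stop):
        if self._fh is None:
            # Open lazily and cache the handle for later reads.
            self._fh = open(self.path, "rb")
        self._fh.seek(start)
        return self._fh.read(stop - start)

    def close(self):
        if self._fh is not None:
            self._fh.close()
            self._fh = None

    def __getstate__(self):
        # Serialize only the path; the unpickled copy reopens the file itself.
        return {"path": self.path, "_fh": None}


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"0123456789")
    os.close(fd)

    eager = EagerReader(path)
    lazy = LazyReader(path)
    lazy.read(0, 3)  # the handle is now cached, but pickling still works

    print("eager picklable:", is_picklable(eager))
    print("lazy picklable:", is_picklable(lazy))

    eager.fh.close()
    lazy.close()
    os.remove(path)
```

In this sketch the cached handle is dropped at pickling time, so each unpickled copy opens its own handle on first use, which would be the situation on a separate worker process.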

gerritholl • Jun 14 '24 08:06