
Add option to stream data from cloud instead of downloading locally

Open mdtanker opened this issue 4 months ago • 0 comments

Currently, all of the datasets available within the fetch module are downloaded and stored on the user's local computer using Pooch. As some of these datasets are large, and as polartoolkit begins to be incorporated into cloud-computing services such as CryoCloud, it would be ideal for users to be able to stream cloud-optimized datasets instead of having to download them in full.

For now, this is intended just for raster datasets, which are typically supplied as NetCDF (.nc) or GeoTIFF (.tif) files.

It seems that the .zarr file format may be the best file type to work with cloud storage (https://matthewrocklin.com/blog/work/2018/02/06/hdf-in-the-cloud).

It seems like Pangeo-Forge is perfectly set up for this, if I understand it correctly.

I will experiment with creating a Pangeo-Forge recipe for Bedmap2 and report back here with how it went.

Note: This extension seems to allow access to EarthData.

Links:

mdtanker, Feb 18 '24 22:02