🚀[FEA]: Example demonstrating how to create local dataset for inference
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
Medium
Please provide a clear description of problem you would like to solve.
We need an example that demonstrates how to use the data sources to dump data into a zarr store, then use that zarr store to run inference.
This type of workflow is particularly relevant for evaluation-oriented inference workflows.
Hi @NickGeneva ,
+1 on this one. I am currently struggling to understand how I can use locally stored zarr / netcdf files to initialise forecasts. This example would be highly relevant for running inference on GPUs that have no direct internet access (e.g. supercomputers).
I'd be happy to work on that example in case you are looking for support.
Thanks !
--- More (optional) context and questions from my use case ---
- How can I reuse the downloaded data (placed in .cache/earth2studio/<api_name>) without pinging the API again, e.g. without an internet connection? It seems that even the persistent cache management still needs to ping the APIs, even when the files are already downloaded in the cache.
I am struggling to understand how locally stored files can be used directly as inputs to the model. I feel that there is a solution with classes in earth2studio/data/xr.py, such as DatasetFile.
- How can classes such as DatasetFile be used? What are the requirements for the files (variable names, dims, ...) for a seamless implementation of DatasetFile?
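
To make the last question concrete, here is a minimal sketch of the kind of local source being asked about. It does not use the earth2studio API: LocalArraySource is a hypothetical class, and the (time, variable, lat, lon) layout, the variable names and the grid are assumptions for illustration only. The array is built in memory so the snippet runs on its own; in practice it would presumably come from a local file opened with xr.open_dataset or xr.open_zarr.

```python
from datetime import datetime

import numpy as np
import xarray as xr


class LocalArraySource:
    """Hypothetical local data source wrapping an xarray.DataArray.

    Assumption: the array has dimensions (time, variable, lat, lon), so a
    label-based selection on time and variable is enough to serve a request.
    """

    REQUIRED_DIMS = ("time", "variable", "lat", "lon")

    def __init__(self, da: xr.DataArray):
        missing = [d for d in self.REQUIRED_DIMS if d not in da.dims]
        if missing:
            raise ValueError(f"missing dimensions: {missing}")
        self.da = da

    def __call__(self, time, variable) -> xr.DataArray:
        # Accept either a single value or a list for both arguments.
        times = np.atleast_1d(np.asarray(time, dtype="datetime64[ns]"))
        variables = [variable] if isinstance(variable, str) else list(variable)
        return self.da.sel(time=times, variable=variables)


# Illustrative stand-in for data that would normally be read from disk.
times = np.array(["2024-01-01T00:00", "2024-01-01T06:00"], dtype="datetime64[ns]")
variables = ["t2m", "u10m"]  # example names only
lat = np.linspace(90.0, -90.0, 5)
lon = np.linspace(0.0, 360.0, 8, endpoint=False)
values = np.random.default_rng(0).standard_normal((2, 2, 5, 8)).astype("float32")

da = xr.DataArray(
    values,
    dims=["time", "variable", "lat", "lon"],
    coords={"time": times, "variable": variables, "lat": lat, "lon": lon},
)

source = LocalArraySource(da)
x0 = source(datetime(2024, 1, 1), ["t2m"])  # initial state for one time and variable
```

Whether DatasetFile expects exactly this layout is the open question above; the sketch only shows the shape of the answer being requested.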