Add support for Anemoi Datasets
ECMWF has started working on an idea similar to the one in #3. Blog post: https://www.ecmwf.int/en/about/media-centre/aifs-blog/2024/data-driven-regional-modelling
Detailed Description
https://anemoi-datasets.readthedocs.io/en/latest/using/combining.html
Context
Embedding a high-resolution regional subset within a global model is an interesting way to get higher resolution where it is required. Anemoi datasets supports this as well by making the cutouts.
Possible Implementation
Import the package and use that? Copy over the cutout functionality? Otherwise, it's a light wrapper around Zarr stores, which we already support.
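If the "copy over the cutout functionality" route were chosen, the core idea could be sketched roughly as below. This is a simplified bounding-box version written for illustration, not Anemoi's actual algorithm: the function name `cutout`, its arguments, and the toy grids are all hypothetical, and longitude wrap-around and irregular region shapes are ignored.

```python
import numpy as np


def cutout(global_latlon, global_values, lam_latlon, lam_values):
    """Merge a regional (LAM) grid into a global grid.

    Global points that fall inside the bounding box of the regional grid
    are dropped and replaced by the regional points.
    Assumption: coordinates are (lat, lon) rows and the box does not
    cross the date line.
    """
    lat_min, lon_min = lam_latlon.min(axis=0)
    lat_max, lon_max = lam_latlon.max(axis=0)
    inside = (
        (global_latlon[:, 0] >= lat_min)
        & (global_latlon[:, 0] <= lat_max)
        & (global_latlon[:, 1] >= lon_min)
        & (global_latlon[:, 1] <= lon_max)
    )
    # Regional points first, then the remaining global points.
    latlon = np.concatenate([lam_latlon, global_latlon[~inside]])
    values = np.concatenate([lam_values, global_values[~inside]])
    return latlon, values


# Toy 3x3 "global" grid and a small 5-point "regional" grid around (0, 10).
lats, lons = np.meshgrid([-10.0, 0.0, 10.0], [0.0, 10.0, 20.0], indexing="ij")
global_latlon = np.stack([lats.ravel(), lons.ravel()], axis=1)
global_values = np.zeros(len(global_latlon))
lam_latlon = np.array(
    [[-1.0, 9.0], [-1.0, 11.0], [0.0, 10.0], [1.0, 9.0], [1.0, 11.0]]
)
lam_values = np.ones(len(lam_latlon))

latlon, values = cutout(global_latlon, global_values, lam_latlon, lam_values)
```

In this toy case only the global point at (0, 10) lies inside the regional box, so the merged grid keeps the other eight global points plus the five regional ones.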
Hi @jacobbieker, could you give me more details about this?
Hi, we just want to make it so we can train with the anemoi datasets package. It has some nice functionality, so we would want to add it to the different ways of loading weather datasets, probably as another option in the data/ module here, somewhat like the IFS dataloader.
Do you have a specific weather dataset in mind to start with?
The easiest would probably be either WeatherBench, or get it working with the ICON archive that we've made on Hugging Face.
With the ICON EU and ICON Global datasets on HF, we could get the loading of a low-resolution global model + high-resolution regional model working together, but it would take more work to get this up and running.
The first step would probably be to try WeatherBench and just get it to output training examples.
hey, is this issue still being worked on?
I'm not sure, but if you want to tackle it, that would be great!
alrighty, taking a jab at it :D
@jacobbieker could i work on this?
@FilippoContessa yeah! Go right ahead
I'll start from this issue. If I understood correctly, we aim to integrate the anemoi datasets by creating another dataloader specifically for this type of ds.
Is that right?
Yes, so ideally we want to be able to load any dataset in the anemoi datasets package.
Hey, if this issue is still open I would love to contribute to it. @jacobbieker
Hey, yeah, I believe it is!
Hi @jacobbieker,
I’m new to open source and interested in contributing to the Anemoi Datasets issue (#102). The task looks a bit intimidating at first, so I’d really appreciate any guidance on where to start or if there’s a smaller part I can focus on initially.
Thanks a lot for your help!
For this, you would want to just make some sort of connector between what is outputted by the anemoi open_datasets or open_lam and what the graph networks expect here. So this should be a fairly small wrapper around the output from opening a dataset in anemoi.
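A minimal sketch of what such a connector might look like, under stated assumptions: the opened dataset is indexable by time step and each item has shape (variables, ensemble, grid points), and the graph networks take one feature vector per grid node. The class name `AnemoiPairDataset` and the `lead_steps` and `member` parameters are hypothetical, and a NumPy array stands in for the opened dataset so the example runs without the anemoi package installed.

```python
import numpy as np


class AnemoiPairDataset:
    """Map-style dataset yielding (input, target) pairs from two time steps.

    `source` is anything indexable by time step whose items have shape
    (variables, ensemble, grid_points). That layout is an assumption about
    what an opened anemoi dataset returns, not something verified here.
    """

    def __init__(self, source, lead_steps=1, member=0):
        self.source = source
        self.lead_steps = lead_steps  # how many steps ahead the target is
        self.member = member          # which ensemble member to use

    def __len__(self):
        return max(len(self.source) - self.lead_steps, 0)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        x = np.asarray(self.source[idx], dtype=np.float32)
        y = np.asarray(self.source[idx + self.lead_steps], dtype=np.float32)
        # Select one member, then (variables, grid) -> (grid, variables)
        # so that each grid node carries one feature vector.
        return x[:, self.member, :].T, y[:, self.member, :].T


# Stand-in for an opened dataset: 5 dates, 3 variables, 1 member, 4 grid points.
fake = np.arange(5 * 3 * 1 * 4, dtype=np.float32).reshape(5, 3, 1, 4)
pairs = AnemoiPairDataset(fake, lead_steps=1)
x, y = pairs[0]
```

Because the class implements `__len__` and `__getitem__`, it follows the map-style dataset protocol and could be handed to a PyTorch `DataLoader`; the real wrapper would pass the opened anemoi dataset in place of `fake`.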
Thank you so much @jacobbieker, and apologies for spamming you across multiple issues 😅 If it’s okay, I’d like to take up this issue and gradually work my way through others as I get more comfortable. Would that be alright?
Yeah, definitely!
Thanks a lot!
@jacobbieker I have submitted a PR, awaiting review. Kindly let me know if any further changes are required.