Atlite ESGF interface for downloading and preparing CMIP6 data
Change proposed in this Pull Request
Add an interface in atlite to the ESGF CMIP database for downloading and preparing CORDEX and CMIP6 data.
Description
An interface in atlite for working with climate model output has been developed. An example of how to use this interface is available in examples/cmip_interface_example.ipynb. The search parameters for the ESGF database can be specified either as a dictionary when setting up the cutout or in the atlite/datasets/cmip.ymal file. The search parameters have to be determined manually by searching the ESGF database.
Variables required by atlite are:
- rsds, surface downwelling radiation shortwave
- rsus, surface upwelling radiation shortwave
- sfcWind, wind speed 10m
- mrro, runoff
- tas, surface temperature
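As a rough illustration of the dictionary form of the search parameters, the sketch below builds one query per required variable. The facet names follow common CMIP6 search facets, but the concrete values (apart from the EC-Earth3 model named in this PR), the `variable_id` key and the helper `build_queries` are assumptions for illustration and not the API introduced by this PR.

```python
# Variables required by atlite, as listed above.
REQUIRED_VARIABLES = {
    "rsds": "surface downwelling shortwave radiation",
    "rsus": "surface upwelling shortwave radiation",
    "sfcWind": "wind speed at 10 m",
    "mrro": "runoff",
    "tas": "surface temperature",
}

# Example search parameters; the values are illustrative assumptions.
search_params = {
    "project": "CMIP6",
    "source_id": "EC-Earth3",
    "experiment_id": "ssp245",
    "variant_label": "r1i1p1f1",
    "frequency": "3hr",
}


def build_queries(params, variables=REQUIRED_VARIABLES):
    """Return one facet dictionary per required variable (hypothetical helper)."""
    # Copy the shared facets and add the variable name to each copy.
    return [{**params, "variable_id": name} for name in variables]


queries = build_queries(search_params)
```

Keeping the shared facets in one dictionary means a different model or scenario only has to be changed in one place.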
This is similar to the variables required by the old CORDEX interface; however, surface roughness does not seem to be available from CMIP. Based on the provided search parameters, the interface uses the pyesgf.search Python API to find matching results; if there is more than one result, it takes the most recent one. The OPeNDAP URLs for that result are then obtained and can be loaded lazily using xarray. This means that the data can be subset according to the cutout, and the download and computation are only triggered by cutout.prepare(). Be aware that some models and data servers don't provide OPeNDAP URLs, which means that you might have to try a different ensemble member to find a model that does. The current example uses data from the EC-Earth3 model. It is also possible to download netCDF files with 1 year of data each individually; however, this hasn't been implemented in atlite. That might be more robust, but at least so far the OPeNDAP interface in xarray has been working flawlessly.
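The "take the most recent result" step could look roughly like the sketch below. The record layout (plain dictionaries with `id` and `version` keys) is an assumption standing in for whatever the search client returns, and `pick_latest` is a hypothetical helper, not a function from this PR.

```python
def pick_latest(results):
    """Return the record with the highest version (hypothetical record layout)."""
    if not results:
        raise LookupError("the search returned no results")

    def version_key(record):
        # Versions are assumed to be date stamps such as "20200310" or "v20200310".
        return int(str(record["version"]).lstrip("v"))

    return max(results, key=version_key)


# Illustrative records only; real search results carry more metadata.
results = [
    {"id": "dataset-a", "version": "20190711"},
    {"id": "dataset-b", "version": "v20200310"},
]
latest = pick_latest(results)
```

Comparing the versions as integers rather than strings avoids surprises when some records carry a leading "v" and others do not.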
Caveats:
The highest temporal resolution available in CMIP is 3hr; however, some models provide only some of the variables required by atlite at 3hr resolution, while the others are at 6hr resolution. CMIP6 also has a quite coarse spatial resolution of ~100 km; CORDEX has a higher resolution. Surface roughness is not available in CMIP, so currently an averaged roughness is taken from ERA5.
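One possible way to deal with mixed 3hr and 6hr variables is to bring everything onto the coarsest common time step before building the cutout. The sketch below only computes that step from frequency labels; the helper names and the example mapping are assumptions, and this PR does not state how the mismatch is actually handled.

```python
import math
import re


def step_hours(frequency):
    """Parse a frequency label such as "3hr" or "6hrPt" into a number of hours."""
    match = re.fullmatch(r"(\d+)hr(Pt)?", frequency)
    if match is None:
        raise ValueError(f"unsupported frequency label: {frequency!r}")
    return int(match.group(1))


def common_step_hours(frequencies):
    """Return the smallest time step that every variable can be sampled on."""
    # The least common multiple guarantees that each series has a value
    # on every time stamp of the common grid.
    return math.lcm(*(step_hours(label) for label in frequencies.values()))


# Hypothetical availability for one model.
available = {"rsds": "3hr", "sfcWind": "6hr", "tas": "3hr"}
step = common_step_hours(available)
```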
Motivation and Context
Explore influence of future climate change on energy systems. Related issue #59
How Has This Been Tested?
The functionality has been tested for calculating wind and PV capacities. Tested with Python > 3.9.
Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist
- [x] I tested my contribution locally and it seems to work fine.
- [x] I locally ran `pytest` inside the repository and no unexpected problems came up.
- [x] I have adjusted the docstrings in the code appropriately.
- [ ] I have documented the effects of my code changes in the documentation `doc/`.
- [x] I have added newly introduced dependencies to `environment.yaml` file.
- [ ] I have added a note to release notes `doc/release_notes.rst`.
- [ ] I have used `pre-commit run --all` to lint/format/check my contribution
Reuse compliance requires a comment with the license at the beginning of each new file (it can just be copied from the other .py/.yaml files). Could you also merge the up-to-date master so that the tests run through?
@FabianHofmann Something I would like your thoughts on: as I mentioned, the surface roughness isn't available from CMIP, and I have yet to address this issue in my code. My idea is to just have a keyword argument with the path to some external roughness dataset; however, I could also make atlite prepare a static roughness dataset from ERA5. At least requiring the path to the roughness dataset to be provided when creating the cutout would avoid any confusion about the data source.
@Ovewh For retrieving features from other sources, one has to add the module to the cutout. But: the surface roughness is only used for extrapolating the wind speed to the turbine hub height. But ESGF can retrieve wind speed at arbitrary heights, right?
No, only a few models provide wind speed at 100 m; most models provide only the surface wind speed. So the wind speed has to be extrapolated.
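For context, extrapolating a surface wind speed to hub height with a roughness length is commonly done with the logarithmic wind profile. The following is a minimal sketch under that assumption; the function name and the example numbers are illustrative, and this thread does not confirm that atlite's own implementation matches it exactly.

```python
import math


def extrapolate_wind_speed(speed, from_height, to_height, roughness):
    """Scale a wind speed between two heights with the logarithmic wind profile."""
    if roughness <= 0:
        raise ValueError("the roughness length must be positive")
    if min(from_height, to_height) <= roughness:
        raise ValueError("both heights must exceed the roughness length")
    # v(to) = v(from) * ln(to / z0) / ln(from / z0)
    return speed * math.log(to_height / roughness) / math.log(from_height / roughness)


# Illustrative values: 5 m/s at 10 m, hub height 100 m, roughness length 0.03 m.
hub_speed = extrapolate_wind_speed(5.0, from_height=10.0, to_height=100.0, roughness=0.03)
```

Because the roughness length only enters through logarithms, moderate errors in it change the extrapolated speed comparatively little.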
Okay, then the roughness data has to come from the ERA5 dataset. atlite allows mixing data sources, so the best way would be to retrieve all variables from ESGF and fill up with ERA5 data, which in principle is done with
cutout = atlite.Cutout('my_cutout', module=['esgf', 'era5'], time=....)
It will then retrieve all available features from esgf and fill up the missing variables (in that case the roughness data) from era5. Could you try that out?
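The fill-up behaviour described here can be pictured as "first module in the list wins". The sketch below is not atlite's implementation; the variable tables and the helper `assign_variables` are hypothetical and only illustrate the ordering logic.

```python
def assign_variables(modules, available):
    """Map each variable to the first module in `modules` that provides it."""
    assignment = {}
    for module in modules:
        for variable in available[module]:
            # Keep the earlier module if the variable was already assigned.
            assignment.setdefault(variable, module)
    return assignment


# Hypothetical variable tables, for illustration only.
available = {
    "esgf": ["wnd10m", "influx", "temperature", "runoff"],
    "era5": ["wnd10m", "influx", "temperature", "runoff", "roughness"],
}
assignment = assign_variables(["esgf", "era5"], available)
```

With this ordering, only `roughness` ends up being taken from `era5`, while everything else comes from `esgf`.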
@FabianHofmann Yes, so the issue is that CMIP contains future climate projections, while ERA5 is a reanalysis. It only makes sense to take the averaged roughness from ERA5, based on either one year or a single month. I did some sensitivity tests calculating capacity factors using constant and forecasted roughness for ERA5. There was only a slight difference in the offshore capacities.
Let's also have a look at https://py-cordex.readthedocs.io/en/stable/index.html
@FabianHofmann It doesn't look like py-cordex has an interface for downloading data, but I have not worked with the CORDEX data.
My first attempt at creating a CMIP interface turned out to be a bit of a dead end. Integrating the download of CMIP data directly into atlite did not work out that well, since the CMIP data files are formatted slightly differently from model to model (e.g. some models provide yearly files while others put 10 years in one file, and the models also use different calendars). It is probably simpler and more robust to make a very general interface for sideloading locally stored climate and weather data into atlite. It would then be up to the user to preprocess the data into a format that atlite can understand.
Interesting! This would be similar to what we have for the SARAH2 dataset (cutout(...) gets called with an additional argument sarah_dir pointing to a local directory containing the manually downloaded SARAH2 data, due to the lack of an API).
(Just a comment)
Outsider question (I'm not familiar with the CMIP/CORDEX datasets): Is there a central repository from which one can manually download the data?
Yes, that's my idea, though perhaps even more general: instead of a path, the argument would be an xarray.Dataset.
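If preprocessing is left to the user, a small check before sideloading could report what is still missing. The sketch below is an assumption about how such a check might look: the helper name, the required dimension names (`time`, `x`, `y`) and the stand-in class are hypothetical. It relies only on `name in dataset` and `dataset.dims`, both of which an xarray.Dataset supports.

```python
def missing_for_sideload(dataset, required_vars, required_dims=("time", "x", "y")):
    """List required variables and dimensions absent from a dataset-like object."""
    missing_vars = [name for name in required_vars if name not in dataset]
    missing_dims = [dim for dim in required_dims if dim not in dataset.dims]
    return missing_vars, missing_dims


class FakeDataset(dict):
    """Minimal stand-in so the sketch runs without xarray installed."""

    def __init__(self, variables, dims):
        super().__init__(variables)
        self.dims = dims


# A dataset that still uses lat/lon names and lacks the radiation variable.
raw = FakeDataset({"sfcWind": None, "tas": None}, dims=("time", "lat", "lon"))
missing_vars, missing_dims = missing_for_sideload(raw, ["sfcWind", "tas", "rsds"])
```

Here the check would flag `rsds` as missing and show that the spatial dimensions still need renaming.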
Yes, all the CMIP6/CORDEX data is stored at ESGF data nodes, which provide different ways of downloading the data, e.g. OPeNDAP and wget scripts.