Wave energy converters and new wind and wave data modules
Closes # (if applicable).
Changes proposed in this Pull Request
Checklist
- [x] Code changes are sufficiently documented; i.e. new functions contain docstrings and further explanations may be given in `doc`.
- [x] Unit tests for new features were added (if applicable).
- [ ] Newly introduced dependencies are added to `environment.yaml`, `environment_docs.yaml` and `setup.py` (if applicable).
- [ ] A note for the release notes `doc/release_notes.rst` of the upcoming release is included.
- [x] I consent to the release of this PR's code under the MIT license.
Thanks a lot for this nice feature!
Can you let us know when you are ready for us to review it?
It would also be nice to have a short example, e.g. an ipynb to include in the documentation.
@brynpickering Can I ask you to review it if you have capacity? Thanks!
Thank you!
I think we can review it right away, and I can prepare some files for documentation. Apologies for being very new to github procedures. I am slowly starting to get the hang of it.
[...] Apologies for being very new to github procedures. I am slowly starting to get the hang of it.
No worries, good that you mention it! We'll help you get settled in; don't hesitate to ask questions if something is unclear, or to ping us (e.g. using @euronion or @brynpickering).
Hello @brynpickering, I have made all of the changes locally, should I commit changes in the forked branch or is there another way to continue?
Yes, in the forked branch. You should be able to just always work in the forked branch and push to your own repository ("origin") whenever you make changes. Those changes will then be made visible in this PR
@euronion not sure what the system should be for accessing the data. This PR works on the basis that the user downloads the data themselves. The data is available via OpenDAP so it would be feasible to query them directly with the appropriate lat/lon/time attrs, e.g.:
```python
import pandas as pd
import xarray as xr

years = [2000]
months = [1, 2, 3, 4, 5, 6]
remote_data = xr.open_mfdataset(
    [
        f"https://opendap.4tu.nl/thredds/dodsC/data2/djht/f359cd0f-d135-416c-9118-e79dccba57b9/1/{year}/TU-MREL_EU_ATL-2M_{year}{month:02}.nc?hs,latitude,longitude"
        for year in years
        for month in months
    ],
    engine="netcdf4",
)
# `time` coord seems to be corrupted in the data source, so we have to translate
# integers to datetime locally (the length must match the stored number of time steps)
remote_data.coords["time"] = pd.date_range(
    f"{years[0]}-{months[0]:02}", periods=remote_data.sizes["time"], freq="1H"
)
remote_data.sel(latitude=y, longitude=x, time=time)  # y, x, time: point/period of interest
```
This data gets read into a Cutout, right?
In an ideal world we support both:
- Automatic retrieval of the data
- Building from a local downloaded file
If it is easily possible, then yes, please implement automatic retrieval as well. Since you already have building from a local file, you could implement it by downloading to a local temporary file and then passing that to your existing function.
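A minimal sketch of what that could look like (hypothetical; `retrieve_and_open` and the way the file is fetched are only illustrative, and it assumes the server offers a plain HTTP download for each monthly file rather than only the dodsC endpoint):

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

import xarray as xr


def retrieve_and_open(url, tmpdir):
    """Download one remote NetCDF file into tmpdir and open it with xarray."""
    local_path = Path(tmpdir) / Path(url).name
    urlretrieve(url, local_path)  # plain HTTP download of the monthly .nc file
    # from here the existing "build from local file" code path can take over
    return xr.open_dataset(local_path)
```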
One question: The naming "wecgenerator" strikes me a bit odd. I haven't looked at it in detail, but my understanding is that in this PR the conversion is from wave energy to electricity. Wouldn't the term "wec" or "waveenergyconverter" then be more appropriate, since not only the "generator" but the whole system is modelled?
@lmezilis I notice in the ECHOWAVE data that the variable `tp` doesn't exist; only `t01` and `t02` exist, linked to wave period data (both being mean wave periods). How did you get `tp`? From a different version of this dataset?
Building from a local downloaded file
@euronion is this how it is done for other datasets?
- For SARAH2/SARAH3, yes. Initially there was no API, which is why we implemented it that way. Today there is an API (#447).
- For ERA5, yes, internally it downloads and creates files, then builds the cutout from those files.
the variable `tp` doesn't exist; only `t01` and `t02` exist, linked to wave period data (both being mean wave periods). How did you get `tp`? From a different version of this dataset?
@brynpickering I must have changed the variables to be more convenient to work with back in the day and completely forgot their original names. `t01` is the one that should be used. That dataset has a lot of different types of ocean variables.
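(If the conversion functions still look for `tp`, one option would be a simple rename on load; just a sketch, with the file name as an example:)

```python
import xarray as xr

# rename the ECHOWAVE mean wave period so the conversion code finds the name it expects
ds = xr.open_dataset("TU-MREL_EU_ATL-2M_201801.nc").rename({"t01": "tp"})
```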
In the meantime, it would be great to add some documentation.
Of course, sorry for the delays. I am working on this; I have my documentation ready but am trying to make it appealing and clear. It will probably be ready by tomorrow.
compare the results for a specific grid cell?
I assume between ERA5 and ECHOWAVE, correct?
One question: The naming "wecgenerator" strikes me a bit odd.
@euronion you are right, a simple `wec` should do it. I will update this. Also in pypsa-eur I used the term `wec_type` as a generator index in `build_renewables`. I notice that wind turbines are called `turbines`. Should I just type `wec` there too?
Yes, `wec` or even just `converter` instead of `wec_type` is fine in this case, because the context and the hierarchy in the config make it clear what it is about (solar -> panel, wind -> turbine, wave -> wec/converter).
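Purely for illustration, the hierarchy could then read along these lines (hypothetical keys and values, not the actual config):

```python
# Purely illustrative sketch of the naming hierarchy, not the actual config keys/values:
renewable = {
    "solar": {"panel": "CSi"},
    "onwind": {"turbine": "Vestas_V112_3MW"},
    "wave": {"converter": "my_wec_model"},  # "converter" (or "wec") instead of "wec_type"
}
```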
I have tried to implement the new convert functions and I think they work fine. However, there seem to be issues with the downloaded data when I try to create the cutout, probably a problem with the OPeNDAP server. I get an invalid ID error:
```
Exception ignored in: <function CachingFileManager.__del__ at 0x000001F1C57E0E00>
Traceback (most recent call last):
  File "c:\Users\thira\anaconda3\envs\pypsa-eur\Lib\site-packages\xarray\backends\file_manager.py", line 250, in __del__
    self.close(needs_lock=False)
  File "c:\Users\thira\anaconda3\envs\pypsa-eur\Lib\site-packages\xarray\backends\file_manager.py", line 234, in close
    file.close()
  File "src/netCDF4/_netCDF4.pyx", line 2669, in netCDF4._netCDF4.Dataset.close
  File "src/netCDF4/_netCDF4.pyx", line 2636, in netCDF4._netCDF4.Dataset._close
  File "src/netCDF4/_netCDF4.pyx", line 2164, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: Not a valid ID

C:\Users\thira\Desktop\atlite-mrel\atlite\data.py:249: UserWarning: The specified chunks separate the stored chunks along dimension "time" starting at index 100. This could degrade performance. Instead, consider rechunking after loading.
  cutout.data = xr.open_dataset(cutout.path, chunks=cutout.chunks)
```
Even with this error the cutout seemingly finishes, but something is not passed correctly: when I slice the cutout with `time=slice("2018-01-01", "2018-01-08")`, I still get the entire month in the cutout, say TU-MREL_EU_ATL-2M_201801.nc.
When I extend it into February (`time=slice("2018-01-01", "2018-02-08")`), the cutout completes with the last timestamps having empty grids. I don't know if this is a cutout.py-related issue, but I cannot create a cutout smaller than one month.
Apart from that the new code looks to be working.
I have also tried to automate the cutout process in case the data are not already downloaded. It seems that there are permission issues there: I can load the dataset remotely, but I cannot load any of the variables; it has to be downloaded. So what I did is write a function to create the URLs and another one to download and merge them.
I have to be honest, it doesn't look ideal, and I am also not sure how to use temporary directories to save the downloaded files and load them from there. I will upload an example without this feature.
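For the temporary-directory part, something like this might work (just a sketch, similar to the one suggested above; `download_and_merge` and `monthly_urls` are placeholders for the two functions and the URL list mentioned here, and it again assumes the files can be fetched over plain HTTP):

```python
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

import xarray as xr


def download_and_merge(urls, tmpdir):
    """Download each monthly file into tmpdir and merge them along the time axis."""
    paths = []
    for url in urls:
        path = Path(tmpdir) / Path(url).name
        urlretrieve(url, path)
        paths.append(path)
    return xr.open_mfdataset(paths, combine="by_coords")


# The directory and the downloaded files are removed automatically when the
# "with" block ends, so the merged data should be loaded (or written into the
# cutout file) before leaving the block.
with tempfile.TemporaryDirectory() as tmpdir:
    data = download_and_merge(monthly_urls, tmpdir)  # monthly_urls: list of file URLs
    data.load()
```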
@lmezilis could you point me to the source of the datasets you're using? I used ones directly from the TUDelft OpenDAP but they may be slightly different to the one you have already downloaded.
I am working with the source that you mentioned above: OPeNDAP. The same dataset was used for our calculations.
You can see in the code below how I obtained the URLs after the cutout parameters were set:
```python
time_index = cutout.coords["time"].to_index()

urls = []
for year in time_index.year.unique():
    year_times = time_index[time_index.year == year]
    months = year_times.month.unique()
    # Limit months in the final year
    if year == time_index[-1].year:
        last_month = time_index[-1].month
        months = months[months <= last_month]
    for month in months:
        url = (
            "https://opendap.4tu.nl/thredds/dodsC/data2/djht/f359cd0f-d135-416c-9118-e79dccba57b9/1/"
            f"{year}/TU-MREL_EU_ATL-2M_{year}{month:02}.nc"
        )
        urls.append((year, month, url))
```
I made all of these similar commits because there are some things that I need to change in the syntax, but the pre-commit auto-fix changes them back. I don't know why.
@lmezilis no worries. We'll probably squash all these commits when we merge it in, so it'll all be cleaned up.
You could install pre-commit locally so the fixes are applied on your machine. In your atlite working environment, run `pre-commit install`; pre-commit will then fix things before you try to commit.
Re: allowing data downloads, I've found that the OpenDAP server fails when trying to download more than a few MB of data at once (DAP failure or Authorization failure). Not sure if you get this issue @lmezilis, but it seems to me that it's too volatile to rely on as a way to access the data.
Yes, I have had the same problem over the last few days, even though last week I could complete it. I say for now let's keep it manual, and I will contact the people running the server to see what we can do.