KeyError with godeeep dataset
Version Checks (indicate both or one)
- [ ] This bug exists on the master branch of PyPSA-USA.
- [x] This bug exists on the develop branch of PyPSA-USA.
The Issue
@asiazzzhang I am getting this error when trying to use the godeeep dataset. Currently on develop branch.
Steps To Reproduce
Config:
# docs :
scenario:
  interconnect: [texas] #"usa|texas|western|eastern"
  clusters: [4a]
  simpl: [380]
  opts: [RPS-REM-TCT-3h]
  ll: [v1.5] #[v1.05]
  scope: "total" # "urban", "rural", or "total"
  sector: "" # G
  planning_horizons: [2030] #(2018-2023, 2030, 2040, 2050)
model_topology:
  transmission_network: 'reeds' # [reeds, tamu]
  topological_boundaries: 'reeds_zone' # [county, reeds_zone]
  interface_transmission_limits: false
  include: # mixed zone types not supported
    # reeds_zone: ['p8']
    # reeds_state: ['CA']
    reeds_ba: ['ERCO']
  aggregate: # eligible keys: [reeds_zone, trans_reg]
    # trans_grp: []
    # reeds_zone: [p8, p9, p10, p11]
  # trim:
  #   zone: ['CA']
# docs :
enable:
  build_cutout: false
snapshots:
  start: "2012-01-01"
  end: "2013-01-01"
  inclusive: "left"
## Handling Renewable Weather Years
renewable:
  dataset: atlite # [atlite, godeeep, wus]
  renewable_weather_years: [2012]
  renewable_scenario_years: [2012]
  renewable_scenarios: ["historical"] #["historical", "rcp45hotter", "rcp45cooler", "rcp85hotter", "rcp85cooler"]
  renewable_snapshots:
    start: "2012-01-01" ## change the year to match renewable_scenario_years
    end: "2013-01-01"
    inclusive: "left"
Error Message
[Mon Sep 29 16:26:46 2025]
rule build_renewable_profiles:
input: data/copernicus/PROBAV_LC100_global_v3.0.1_2019-nrt_Discrete-Classification-map_USA_EPSG-4326.tif, data/natura.tiff, resources/class_files/texas/Geospatial/country_shapes.geojson, resources/class_files/texas/Geospatial/offshore_shapes.geojson, repo_data/geospatial/CEC_GIS/CEC_Wind_BaseScreen_epsg3310.tif, repo_data/geospatial/CEC_GIS/CEC_Solar_BaseScreen_epsg3310.tif, repo_data/geospatial/boem_osw_planning_areas.tif, resources/class_files/texas/Geospatial/regions_onshore.geojson, cutouts/usa_era5_2012.nc
output: resources/class_files/texas/profile_onwind.nc, results/class_files/texas/land_use_availability_onwind.png
log: logs/class_files/texas/build_renewable_profile_onwind.log
jobid: 13
benchmark: benchmarks/class_files/texas/build_renewable_profiles_onwind
reason: Missing output files: resources/class_files/texas/profile_onwind.nc; Input files updated by another job: resources/class_files/texas/Geospatial/country_shapes.geojson, resources/class_files/texas/Geospatial/offshore_shapes.geojson, resources/class_files/texas/Geospatial/regions_onshore.geojson
wildcards: interconnect=texas, technology=onwind
resources: tmpdir=/var/folders/00/vqry7y9s78q3rvf_m_jmk_bc0000gn/T, mem_mb=46870, mem_mib=44699, walltime=04:00:00
INFO:__main__:using cutout "cutouts/usa_era5_2012.nc"
INFO:__main__:Calculate landuse availability...
INFO:__main__:Completed landuse availability calculation (100.93s)
INFO:atlite.convert:Convert and aggregate 'wind'.
INFO:atlite.convert:Convert and aggregate 'wind'.
INFO:__main__:Loading godeeep renewable data...
File wind_100m_historical_wind_gen_cf_2012_100m_bus_mean.nc already exists. Use force_redownload=True to redownload.
Traceback (most recent call last):
File "/Users/kamrantehranchi/Local_Documents/pypsa-usa/workflow/.snakemake/scripts/tmpa_f5my9f.build_renewable_profiles.py", line 242, in <module>
profile = profile.sel(time=renewable_sns) # filtering for appropriate time snapshot
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kamrantehranchi/Local_Documents/pypsa-usa/.venv/lib/python3.11/site-packages/xarray/core/dataarray.py", line 1670, in sel
ds = self._to_temp_dataset().sel(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kamrantehranchi/Local_Documents/pypsa-usa/.venv/lib/python3.11/site-packages/xarray/core/dataset.py", line 3184, in sel
query_results = map_index_queries(
^^^^^^^^^^^^^^^^^^
File "/Users/kamrantehranchi/Local_Documents/pypsa-usa/.venv/lib/python3.11/site-packages/xarray/core/indexing.py", line 193, in map_index_queries
results.append(index.sel(labels, **options))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kamrantehranchi/Local_Documents/pypsa-usa/.venv/lib/python3.11/site-packages/xarray/core/indexes.py", line 801, in sel
raise KeyError(f"not all values found in index {coord_name!r}")
KeyError: "not all values found in index 'time'"
Anything else?
No response
Yikes, I'll take a look at this!
The cause is that 2012 is a leap year and should have 8784 hourly timestamps, but the GODEEEP dataset was built from a regular-year template of 8760 hours. As a result, the file includes Feb 29, 2012 but excludes Dec 31, 2012. The requested snapshots for Dec 31 are therefore missing from the time index, which is why the selection errors out with the KeyError.
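The mismatch can be reproduced without the dataset. This is a minimal sketch using only the standard library, under the assumption that the file simply holds the first 8760 hourly timestamps of 2012; the names `requested`, `available`, and `TEMPLATE_HOURS` are illustrative and do not come from the PyPSA-USA script.

```python
from datetime import datetime, timedelta

YEAR = 2012  # leap year from the report
HOUR = timedelta(hours=1)

start = datetime(YEAR, 1, 1)
end = datetime(YEAR + 1, 1, 1)  # exclusive end, matching inclusive: "left"

# Hourly snapshots requested for the full leap year.
requested = [start + i * HOUR for i in range((end - start) // HOUR)]

# Assumption: the dataset holds only the first 8760 hours of the year.
TEMPLATE_HOURS = 8760
available = set(requested[:TEMPLATE_HOURS])

# Timestamps that a label-based selection would fail to find.
missing = [t for t in requested if t not in available]

print(len(requested))           # 8784
print(len(missing))             # 24
print(missing[0], missing[-1])  # 2012-12-31 00:00:00 2012-12-31 23:00:00
```

All 24 missing timestamps fall on Dec 31, which matches the "not all values found in index 'time'" error.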
Add the following to the build_renewable_profiles.py script before the profile = profile.sel(time=renewable_sns) line:
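import pandas as pd

# `year` is assumed to be the weather year already defined in the script (2012 here).
time_index = pd.DatetimeIndex(profile.time.values)

def shift_leap_year(dt):
    # Move every timestamp from Feb 29 onward forward by one day so the
    # 8760-hour series ends on Dec 31 instead of Dec 30.
    if dt >= pd.Timestamp(f'{year}-02-29'):
        return dt + pd.Timedelta(days=1)
    return dt

new_time_index = pd.DatetimeIndex([shift_leap_year(t) for t in time_index])
profile = profile.assign_coords(time=new_time_index)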
This should fix the issue for now, I'll commit these changes soon!
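For anyone who wants to check the effect of the shift before the commit lands, here is a small vectorized sketch of the same idea on a synthetic index. It assumes the GODEEEP time axis is the first 8760 hours of 2012; `shift_after_leap_day` is a hypothetical helper name, not a function in the repository, and it only applies to leap years.

```python
import pandas as pd


def shift_after_leap_day(index: pd.DatetimeIndex, year: int) -> pd.DatetimeIndex:
    """Shift timestamps on or after Feb 29 of `year` forward by one day."""
    cutoff = pd.Timestamp(year=year, month=2, day=29)  # raises for non-leap years
    # Keep timestamps before the cutoff, replace the rest with the shifted values.
    return index.where(index < cutoff, index + pd.Timedelta(days=1))


# Synthetic stand-in for the dataset's time axis (assumption: first 8760 hours).
raw = pd.date_range("2012-01-01", periods=8760, freq="h")
shifted = shift_after_leap_day(raw, 2012)

has_leap_day = bool(((shifted.month == 2) & (shifted.day == 29)).any())

print(len(shifted))   # 8760
print(shifted[-1])    # 2012-12-31 23:00:00
print(has_leap_day)   # False
```

After the shift the index ends on Dec 31 but no longer contains Feb 29, so the selection only succeeds if renewable_sns also leaves out the Feb 29 hours. Whether it does is not shown in this thread.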