cfgrib
Reading some variables in GFS files (with heightAboveGround)
Perhaps this is linked to https://github.com/ecmwf/cfgrib/issues/75 but I am using the latest version of cfgrib and I cannot read some variables in a GFS file.
I am trying to access this: https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.20211014/18/atmos/gfs.t18z.pgrb2.0p25.f027
If I try to read it:
cfgrib.open_dataset('gfs.t18z.pgrb2.0p25.f027')
---------------------------------------------------------------------------
DatasetBuildError Traceback (most recent call last)
~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in build_dataset_components(index, errors, encode_cf, squeeze, log, read_keys, time_dims, extra_coords)
640 time_dims=time_dims,
--> 641 extra_coords=extra_coords,
642 )
~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in build_variable_components(index, encode_cf, filter_by_keys, log, errors, squeeze, read_keys, time_dims, extra_coords)
471 ) -> T.Tuple[T.Dict[str, int], Variable, T.Dict[str, Variable]]:
--> 472 data_var_attrs = enforce_unique_attributes(index, DATA_ATTRIBUTES_KEYS, filter_by_keys)
473 grid_type_keys = GRID_TYPE_MAP.get(index.getone("gridType"), [])
~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in enforce_unique_attributes(index, attributes_keys, filter_by_keys)
272 fbks.append(fbk)
--> 273 raise DatasetBuildError("multiple values for key %r" % key, key, fbks)
274 if values and values[0] not in ("undef", "unknown"):
DatasetBuildError: multiple values for key 'typeOfLevel'
During handling of the above exception, another exception occurred:
DatasetBuildError Traceback (most recent call last)
<ipython-input-12-8e07db70dea6> in <module>
----> 1 cfgrib.open_dataset('gfs.t18z.pgrb2.0p25.f027')
~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/xarray_store.py in open_dataset(path, **kwargs)
36 raise ValueError("only engine=='cfgrib' is supported")
37 kwargs["engine"] = "cfgrib"
---> 38 return xr.open_dataset(path, **kwargs) # type: ignore
39
40
~/miniconda3/envs/pydev/lib/python3.7/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, autoclose, concat_characters, decode_coords, engine, chunks, lock, cache, drop_variables, backend_kwargs, use_cftime, decode_timedelta)
570
571 opener = _get_backend_cls(engine)
--> 572 store = opener(filename_or_obj, **extra_kwargs, **backend_kwargs)
573
574 with close_on_error(store):
~/miniconda3/envs/pydev/lib/python3.7/site-packages/xarray/backends/cfgrib_.py in __init__(self, filename, lock, **backend_kwargs)
43 lock = ECCODES_LOCK
44 self.lock = ensure_lock(lock)
---> 45 self.ds = cfgrib.open_file(filename, **backend_kwargs)
46
47 def open_store_variable(self, name, var):
~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in open_file(path, grib_errors, indexpath, filter_by_keys, read_keys, time_dims, extra_coords, **kwargs)
718 return Dataset(
719 *build_dataset_components(
--> 720 index, read_keys=read_keys, time_dims=time_dims, extra_coords=extra_coords, **kwargs
721 )
722 )
~/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py in build_dataset_components(index, errors, encode_cf, squeeze, log, read_keys, time_dims, extra_coords)
652 fbks.append(fbk)
653 error_message += "\n filter_by_keys=%r" % fbk
--> 654 raise DatasetBuildError(error_message, key, fbks)
655 short_name = data_var.attributes.get("GRIB_shortName", "paramId_%d" % param_id)
656 var_name = data_var.attributes.get("GRIB_cfVarName", "unknown")
DatasetBuildError: multiple values for unique key, try re-open the file with one of:
filter_by_keys={'typeOfLevel': 'meanSea'}
filter_by_keys={'typeOfLevel': 'hybrid'}
filter_by_keys={'typeOfLevel': 'atmosphere'}
filter_by_keys={'typeOfLevel': 'surface'}
filter_by_keys={'typeOfLevel': 'unknown'}
filter_by_keys={'typeOfLevel': 'isobaricInPa'}
filter_by_keys={'typeOfLevel': 'isobaricInhPa'}
filter_by_keys={'typeOfLevel': 'heightAboveGround'}
filter_by_keys={'typeOfLevel': 'depthBelowLandLayer'}
filter_by_keys={'typeOfLevel': 'heightAboveSea'}
filter_by_keys={'typeOfLevel': 'nominalTop'}
filter_by_keys={'typeOfLevel': 'heightAboveGroundLayer'}
filter_by_keys={'typeOfLevel': 'tropopause'}
filter_by_keys={'typeOfLevel': 'maxWind'}
filter_by_keys={'typeOfLevel': 'isothermZero'}
filter_by_keys={'typeOfLevel': 'pressureFromGroundLayer'}
filter_by_keys={'typeOfLevel': 'sigmaLayer'}
filter_by_keys={'typeOfLevel': 'sigma'}
filter_by_keys={'typeOfLevel': 'potentialVorticity'}
And if I try this instead:
d = xr.open_dataset('gfs.t18z.pgrb2.0p25.f027', decode_cf=True, engine='cfgrib', backend_kwargs={'filter_by_keys': {'typeOfLevel': 'heightAboveGround'}})
I get this error:
skipping variable: paramId==167 shortName='t2m'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==174096 shortName='sh2'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==168 shortName='d2m'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==260242 shortName='r2'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==260255 shortName='aptmp'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==3015 shortName='tmax'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==3016 shortName='tmin'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=2.0)
skipping variable: paramId==165 shortName='u10'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=10.0)
skipping variable: paramId==166 shortName='v10'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=10.0)
skipping variable: paramId==131 shortName='u'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=('heightAboveGround',), data=array([20., 30., 40., 50., 80.]))
skipping variable: paramId==132 shortName='v'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=('heightAboveGround',), data=array([20., 30., 40., 50., 80.]))
skipping variable: paramId==130 shortName='t'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=('heightAboveGround',), data=array([ 80., 100.]))
skipping variable: paramId==133 shortName='q'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=80.0)
skipping variable: paramId==54 shortName='pres'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=80.0)
skipping variable: paramId==228246 shortName='u100'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=100.0)
skipping variable: paramId==228247 shortName='v100'
Traceback (most recent call last):
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 660, in build_dataset_components
    dict_merge(variables, coord_vars)
  File "/home/felicma/miniconda3/envs/pydev/lib/python3.7/site-packages/cfgrib/dataset.py", line 593, in dict_merge
    "key=%r value=%r new_value=%r" % (key, master[key], value)
cfgrib.dataset.DatasetBuildError: key present and new value is different: key='heightAboveGround' value=Variable(dimensions=('heightAboveGround',), data=array([1000., 4000.])) new_value=Variable(dimensions=(), data=100.0)
Hi @matteodefelice,
I've downloaded a similar GRIB file and can see what the problem is. With a grib_ls, we can see the messages that are being filtered:
grib_ls -ptypeOfLevel,level,shortName ./gfs.t18z.pgrb2.0p25.f027 | grep heightAboveGround
heightAboveGround 4000 refd
heightAboveGround 1000 refd
heightAboveGround 2 2t
heightAboveGround 2 2sh
heightAboveGround 2 2d
heightAboveGround 2 2r
heightAboveGround 2 aptmp
heightAboveGround 2 tmax
heightAboveGround 2 tmin
heightAboveGround 10 10u
heightAboveGround 10 10v
heightAboveGroundLayer 3000 hlcy
heightAboveGroundLayer 6000 ustm
heightAboveGroundLayer 6000 vstm
heightAboveGround 20 u
heightAboveGround 20 v
heightAboveGround 30 u
heightAboveGround 30 v
heightAboveGround 40 u
heightAboveGround 40 v
heightAboveGround 50 u
heightAboveGround 50 v
heightAboveGround 80 t
heightAboveGround 80 q
heightAboveGround 80 pres
heightAboveGround 80 u
heightAboveGround 80 v
heightAboveGround 100 t
heightAboveGround 100 100u
heightAboveGround 100 100v
Now, cfgrib does not like variables with different coordinates (see also #13): within one dataset, a coordinate such as heightAboveGround can only hold one set of values. In this case, it first reads variable 'refd' and takes the level coordinate to be 1000 and 4000. Then it hits variable '2t' with a level of 2, which conflicts with the values already stored, so the variable is skipped. See the note in the README: https://github.com/ecmwf/cfgrib#filter-heterogeneous-grib-files
So you will need to also filter by variable to get only those that have the same levels. Before you ask, I'm not sure how to handle the awkward case of u/u10/u100 having different names and paramIds!
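As a rough illustration of "filter by variable too", here is a minimal sketch that takes the grib_ls listing above, works out which variables share the same set of levels, and builds one filter_by_keys dictionary per variable. The helper names (parse_listing, homogeneous_groups, variable_filters, open_variable) are made up for this example, and it assumes that 'shortName' is accepted as a filter key alongside 'typeOfLevel'.

```python
from collections import defaultdict

# typeOfLevel / level / shortName rows copied from the grib_ls output above.
LISTING = """\
heightAboveGround 4000 refd
heightAboveGround 1000 refd
heightAboveGround 2 2t
heightAboveGround 2 2sh
heightAboveGround 2 2d
heightAboveGround 2 2r
heightAboveGround 2 aptmp
heightAboveGround 2 tmax
heightAboveGround 2 tmin
heightAboveGround 10 10u
heightAboveGround 10 10v
heightAboveGroundLayer 3000 hlcy
heightAboveGroundLayer 6000 ustm
heightAboveGroundLayer 6000 vstm
heightAboveGround 20 u
heightAboveGround 20 v
heightAboveGround 30 u
heightAboveGround 30 v
heightAboveGround 40 u
heightAboveGround 40 v
heightAboveGround 50 u
heightAboveGround 50 v
heightAboveGround 80 t
heightAboveGround 80 q
heightAboveGround 80 pres
heightAboveGround 80 u
heightAboveGround 80 v
heightAboveGround 100 t
heightAboveGround 100 100u
heightAboveGround 100 100v
"""


def parse_listing(text):
    """Turn the listing into (typeOfLevel, level, shortName) tuples."""
    rows = []
    for line in text.splitlines():
        type_of_level, level, short_name = line.split()
        rows.append((type_of_level, int(level), short_name))
    return rows


def homogeneous_groups(rows, type_of_level):
    """Map each distinct tuple of levels to the variables defined on exactly those levels."""
    levels = defaultdict(set)
    for tol, level, short_name in rows:
        if tol == type_of_level:
            levels[short_name].add(level)
    groups = defaultdict(list)
    for short_name, level_set in levels.items():
        groups[tuple(sorted(level_set))].append(short_name)
    return dict(groups)


def variable_filters(rows, type_of_level):
    """Build one filter_by_keys dict per variable (hypothetical helper)."""
    names = [n for group in homogeneous_groups(rows, type_of_level).values() for n in group]
    return [{"typeOfLevel": type_of_level, "shortName": n} for n in names]


def open_variable(path, fbk):
    """Open a single variable; needs xarray and cfgrib installed."""
    import xarray as xr

    return xr.open_dataset(path, engine="cfgrib", backend_kwargs={"filter_by_keys": fbk})
```

Each entry returned by variable_filters could then be passed to open_variable together with the file path. Variables that end up in the same group (for example '2t' and '2d') share their levels, so they are the ones that could in principle live in one dataset.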
I do hope that this helps though.
Best regards, Iain
Thanks a lot. I have actually solved the issue with some pre-processing in wgrib2, but I would have loved to use only Python.
Ok, no problem. Metview's Python interface, which is very good at GRIB handling, can also be used for pre-processing, but it requires some binaries to be installed (usually through conda). We are, however, discussing some equivalent pure-Python features that could help in this sort of situation.
I'll close this issue now though.
Another option is to filter by level. The problem is that the file contains too many different levels, so you can obtain the data by filtering on a single one. Here is an example.
For the 2 m height, with a recent xarray (which forwards extra keyword arguments to the backend), it looks like this:
data_2maboveground = xarray.open_dataset(local_filename, engine="cfgrib", filter_by_keys={'typeOfLevel': 'heightAboveGround', 'level': 2})
With older xarray versions the filter has to go through backend_kwargs (this form also works with recent versions):
data_2maboveground = xarray.open_dataset(local_filename, engine="cfgrib", backend_kwargs={'filter_by_keys': {'typeOfLevel': 'heightAboveGround', 'level': 2}})
P.S. I added 'typeOfLevel': 'heightAboveGround' because I have more data in the GRIB file.
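The per-level filter above can be repeated for several heights. The following is only a sketch: level_filters and open_levels are hypothetical helper names, and the list of levels is taken from the grib_ls output earlier in this thread.

```python
def level_filters(levels, type_of_level="heightAboveGround"):
    """Build one filter_by_keys dict per requested level."""
    return {
        level: {"typeOfLevel": type_of_level, "level": level}
        for level in levels
    }


def open_levels(path, levels, type_of_level="heightAboveGround"):
    """Open one dataset per level; needs xarray and cfgrib installed."""
    import xarray as xr

    return {
        level: xr.open_dataset(
            path, engine="cfgrib", backend_kwargs={"filter_by_keys": fbk}
        )
        for level, fbk in level_filters(levels, type_of_level).items()
    }
```

For example, open_levels(local_filename, [2, 10, 80, 100]) would return a dictionary of datasets keyed by level, each of which contains only variables defined at that single height.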