ClimateTools.jl
difficulty reading CF compliant files
After loading one of my files via Panoply to verify that there was nothing wrong with it (see below), I tried the model = load(gcm_files, "tasmax", poly=poly_reg) example and got ERROR: Manually verify x/lat dimension name.
Taking a look at the code, I see that getdim_lat relies on a list of hard-coded names. I thought that the more general approach was to rely on long_name + units. Not sure what to suggest; adding to the hard-coded list would be a short-term fix just for me...
lon_c (720)
Datatype: Float64
Dimensions: lon_c
Attributes:
units = degrees_east
long_name = longitude

Also, the next file I am planning to present to ClimateTools is also CF-compliant but not on a regular lat-lon grid (see below). But I am going to wait a bit before I try that.

Thanks for the input! Indeed, this is certainly not an elegant function. From memory, this was coded for a project that involved regional climate models (your second case).
Not sure if the extraction of lon_c based on long_name is robust though. It seems more robust to go with the detected dimensions. For instance, a regional climate model dataset will not have longitude as a dimension. It will have a longitude grid though, with its long_name being longitude. If I rely on detecting, say, longitude, we will extract the longitude grid and not the native dimension, which could be in meters, degrees on a stereographic grid, etc.
Open to suggestions though as hardcoding this is not a robust solution either.
Cool. Will take a deeper look and might send a PR later if I find a way to improve the code.
regional climate models (your second case)
Just to clarify, I use sets of these files that collectively add up to global model variables
You mean like "tiles"?
Just for reference: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#latitude-coordinate
From what I've seen with other tools, they detect dimensions using the units, which is what the CF Conventions seem to imply as well.
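To make the units-based approach concrete, here is a minimal sketch of the check, not ClimateTools code. It assumes each variable's attributes are available as a plain Dict of strings; the names LAT_UNITS, LON_UNITS, is_lat, is_lon and find_lon are hypothetical. The accepted unit strings are the ones listed in the latitude and longitude sections of CF 1.6.

```julia
# Unit strings that CF 1.6 accepts for latitude and longitude coordinates.
const LAT_UNITS = ("degrees_north", "degree_north", "degree_N",
                   "degrees_N", "degreeN", "degreesN")
const LON_UNITS = ("degrees_east", "degree_east", "degree_E",
                   "degrees_E", "degreeE", "degreesE")

# `attribs` maps attribute names to values for one variable
# (hypothetical layout, used here instead of a real NetCDF handle).
is_lat(attribs::AbstractDict) = get(attribs, "units", "") in LAT_UNITS
is_lon(attribs::AbstractDict) = get(attribs, "units", "") in LON_UNITS

# Names of all variables whose units mark them as longitude, sorted.
find_lon(vars::AbstractDict) =
    sort([name for (name, attribs) in vars if is_lon(attribs)])

# Attributes copied from the lon_c variable shown at the top of this thread.
vars = Dict(
    "lon_c" => Dict("units" => "degrees_east", "long_name" => "longitude"),
)

lon_names = find_lon(vars)
```

With this check, lon_c is picked up even though its name is not in any hard-coded list. Note that a rotated-pole dimension with units = degrees would match neither test, so it would still need separate handling.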
Thanks! I've seen that in RCMs, the latitude and longitude grids also have an official standard_name. Hence, it should be possible to discern dimensions and coordinates adequately.
I'm gonna rework this extraction part asap.
As highlighted by @lmilechin, it is the units attribute that should be used to identify coordinates per the CF guidelines, as opposed to standard_name, which is only optional and, e.g., does not distinguish between different longitude conventions.
Great! Thanks
To effectively tackle this issue, having access to some problematic datasets would be welcomed.
How about using the files I mentioned at the top of this thread?
These get generated by running 04_netcdf.ipynb from GlobalOceanNotebooks:
outputs/nctiles-newfiles/interp/ETAN.nc
outputs/nctiles-newfiles/tiled/ETAN/ETAN.*.nc
ps. I just reran the notebook in binder & regenerated these without problem
Yes -- one tile = 1 file in this example
Thanks, I was able to produce the files at home.
Also, I re-read the thread and wanted to clarify: when I spoke about "dimension", I was mostly referring to the dimensions of the dataset, not the units/measure of the variable itself. Hence the need, for datasets on projected grids, to distinguish between a rotated latitude "dimension" and the latitude grid (a variable in the dataset, not one of its dimensions).
Anyway, I'll be forced to think about a more general solution to this!
Edit: for example, for this dataset, there is rlat and rlon.
Dimensions
rlat = 412
rlon = 424
time = 2920
bnds = 2
Variables
lat (424 × 412)
Datatype: Float64
Dimensions: rlon × rlat
Attributes:
standard_name = latitude
long_name = latitude
units = degrees_north
lon (424 × 412)
Datatype: Float64
Dimensions: rlon × rlat
Attributes:
standard_name = longitude
long_name = longitude
units = degrees_east
pr (424 × 412 × 2920)
Datatype: Float32
Dimensions: rlon × rlat × time
Attributes:
grid_mapping = rotated_pole
_FillValue = 1.0e20
missing_value = 1.0e20
standard_name = precipitation_flux
long_name = Precipitation
units = kg m-2 s-1
coordinates = lon lat
cell_methods = time: mean
rlat (412)
Datatype: Float64
Dimensions: rlat
Attributes:
standard_name = grid_latitude
long_name = latitude in rotated pole grid
units = degrees
axis = Y
rlon (424)
Datatype: Float64
Dimensions: rlon
Attributes:
standard_name = grid_longitude
long_name = longitude in rotated pole grid
units = degrees
axis = X
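The distinction between native dimensions and latitude/longitude grids can be illustrated on the metadata above. This is only a sketch under stated assumptions: the VarInfo struct and both helper functions are hypothetical, and the dataset is represented as a plain Dict rather than an open NetCDF file. It relies on two CF rules: a coordinate variable is one-dimensional and shares its name with its dimension, and auxiliary coordinates are named in the data variable's coordinates attribute.

```julia
# Minimal description of one variable (hypothetical layout).
struct VarInfo
    dims::Vector{String}
    attribs::Dict{String,String}
end

# A CF coordinate variable is 1-D and named after its own dimension.
is_coordinate_variable(name, v::VarInfo) = v.dims == [name]

# Auxiliary coordinates are listed, space separated, in `coordinates`.
auxiliary_coordinates(v::VarInfo) =
    String.(split(get(v.attribs, "coordinates", "")))

# Subset of the rotated-pole dataset listed above.
ds = Dict(
    "lat"  => VarInfo(["rlon", "rlat"], Dict("units" => "degrees_north")),
    "lon"  => VarInfo(["rlon", "rlat"], Dict("units" => "degrees_east")),
    "rlat" => VarInfo(["rlat"], Dict("units" => "degrees", "axis" => "Y")),
    "rlon" => VarInfo(["rlon"], Dict("units" => "degrees", "axis" => "X")),
    "pr"   => VarInfo(["rlon", "rlat", "time"],
                      Dict("coordinates" => "lon lat")),
)

# Native dimensions of the grid (sorted for a stable order).
native = sort([n for (n, v) in ds if is_coordinate_variable(n, v)])

# Latitude/longitude grids attached to the data variable.
aux = auxiliary_coordinates(ds["pr"])
```

Here native holds rlat and rlon, while aux holds the 2-D lon and lat grids, so the two roles stay separate even though all four variables describe latitude or longitude in some sense.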
I've sketched some code in #137. It's pretty rough right now, but so far it works. Just not sure about the robustness though. I haven't had the time to test your files, @gaelforget, but I'm pretty sure it does not work with them. I'm currently testing for the axis (optional attribute in CF files) and standard_name attributes of the dimensions. Will add long_name later.
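For readers following along, the kind of test described here could look roughly like the sketch below. It is not the code from #137: find_x_dim is a hypothetical name, attributes are assumed to be plain Dicts of strings, and the list of standard_name values is an illustrative selection of CF standard names for X-type coordinates.

```julia
# Return the name of the dimension variable that plays the X role,
# or `nothing` if no candidate matches. `dimvars` maps the name of each
# dimension variable to its attributes (hypothetical layout).
function find_x_dim(dimvars::AbstractDict)
    x_standard_names = ("longitude", "grid_longitude",
                        "projection_x_coordinate")
    # Checks are tried in order: explicit `axis` first, then `standard_name`.
    checks = (
        a -> get(a, "axis", "") == "X",
        a -> get(a, "standard_name", "") in x_standard_names,
    )
    for check in checks
        # Sorted keys keep the result independent of Dict ordering.
        for name in sort(collect(keys(dimvars)))
            check(dimvars[name]) && return name
        end
    end
    return nothing
end

# Dimension variables of the rotated-pole dataset shown above.
rcm = Dict(
    "rlat" => Dict("standard_name" => "grid_latitude", "axis" => "Y"),
    "rlon" => Dict("standard_name" => "grid_longitude", "axis" => "X"),
)

# The lon_c variable from the top of the thread has neither attribute.
tiles = Dict(
    "lon_c" => Dict("units" => "degrees_east", "long_name" => "longitude"),
)

x_rcm = find_x_dim(rcm)
x_tiles = find_x_dim(tiles)
```

Under these assumptions the rotated-pole file resolves to rlon, while a file that only carries units and long_name returns nothing, which is why a units-based fallback would still be needed for files like the tiled ones.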
@gaelforget In the files produced by the notebook, both lat_c and lon_c have longitude as their long_name.