Netcdf files with unexpected order of time, easting and northing coordinates.
Hi all, I would like to open a nc file and process it with terra package (see header below) . The terra::rast function (default settings) does not recognize the easting and northing coordinates but it takes "time coordinate as one of the the two. I see generally that the coordinate order is expected to be "time, easting, northing" and not "easting ,northing ,time" . Are there some specific settings in order terra::rast can read correctly the spatiotemporal gridded variables? Thank you for your attention and suggestion.
Best Emanuele @ecor
$ ncdump -h METEO_DATA_202003120000.NC
netcdf METEO_DATA_202003120000 {
dimensions:
time = 24 ;
northing = 180 ;
easting = 185 ;
variables:
double time(time) ;
time:standard_name = "time" ;
time:units = "days since 2020-03-12T00:00:00+01:00" ;
double northing(northing) ;
northing:units = "m" ;
northing:long_name = "Northing" ;
northing:coordinate\ system = "CH1903 / LV03 (EPSG:21781)" ;
double easting(easting) ;
easting:units = "m" ;
easting:long_name = "Easting" ;
easting:coordinate\ system = "CH1903 / LV03 (EPSG:21781)" ;
double lwrc(easting, northing, time) ;
lwrc:_FillValue = -9999. ;
lwrc:standard_name = "downwelling_longwave_flux_in_air" ;
lwrc:long_name = "Incoming longwave radiation" ;
lwrc:coverage_content_type = "modelResult" ;
lwrc:units = "W m-2" ;
double sdri(easting, northing, time) ;
sdri:_FillValue = -9999. ;
sdri:standard_name = "direct_downwelling_shortwave_flux_in_air" ;
sdri:long_name = "Incoming direct shortwave radiation on inclined surface" ;
sdri:coverage_content_type = "modelResult" ;
sdri:units = "W m-2" ;
double sdfd(easting, northing, time) ;
sdfd:_FillValue = -9999. ;
sdfd:standard_name = "diffuse_downwelling_shortwave_flux_in_air" ;
sdfd:long_name = "Incoming diffuse shortwave radiation" ;
sdfd:coverage_content_type = "modelResult" ;
sdfd:units = "W m-2" ;
double tais(easting, northing, time) ;
tais:_FillValue = -9999. ;
tais:units = "K" ;
tais:standard_name = "air_temperature" ;
tais:long_name = "Air temperature at 10 m above ground" ;
tais:coverage_content_type = "modelResult" ;
double rhus(easting, northing, time) ;
rhus:_FillValue = -9999. ;
rhus:units = "%" ;
rhus:standard_name = "relative_humidity" ;
rhus:long_name = "Relative humidity at 10 m above ground" ;
rhus:coverage_content_type = "modelResult" ;
double pail(easting, northing, time) ;
pail:_FillValue = -9999. ;
pail:units = "Pa" ;
pail:standard_name = "air_pressure" ;
pail:long_name = "Air pressure" ;
pail:coverage_content_type = "modelResult" ;
double wnsd(easting, northing, time) ;
wnsd:_FillValue = -9999. ;
wnsd:units = "m/s" ;
wnsd:standard_name = "wind_speed" ;
wnsd:long_name = "Wind speed at 10 m above ground" ;
wnsd:coverage_content_type = "modelResult" ;
double wndd(easting, northing, time) ;
wndd:_FillValue = -9999. ;
wndd:units = "deg" ;
wndd:standard_name = "wind_from_direction" ;
wndd:long_name = "Wind direction at 10 m above ground" ;
wndd:coverage_content_type = "modelResult" ;
double proi(easting, northing, time) ;
proi:_FillValue = -9999. ;
proi:units = "mm/hr" ;
proi:standard_name = "precipitation_flux" ;
proi:long_name = "Precipitation" ;
proi:coverage_content_type = "modelResult" ;
// global attributes:
:title = "High-resolution hydrometeorological data for the Dischma valley and surrounding areas in eastern Switzerland" ;
:summary = "This dataset contains high-resolution meteorological input data useful for developing and testing physics-based snow and landsurface models. The dataset contains hourly data and has a spatial resolution of 100m and an extent of 18500 by 18000m." ;
:keywords = "downwelling_longwave_flux_in_air; direct_downwelling_shortwave_flux_in_air; diffuse_downwelling_shortwave_flux_in_air; air_temperature; relative_humidity; air_pressure; wind_speed; wind_from_direction; precipitation_flux" ;
:conventions = "ACDD 1-3; CF-1.11" ;
:creator_name = "Jan Magnusson; Tobias Jonas" ;
:creator_email = "[email protected]; [email protected]" ;
:creator_url = "https://orcid.org/0000-0003-0257-1862; https://orcid.org/0000-0003-0386-8676" ;
:institution = "WSL Institute for Snow and Avalanche Research SLF" ;
:project = "International Network for Alpine Research Catchment Hydrology" ;
:geospatial_lat_min = 46.6734987688503 ;
:geospatial_lat_max = 46.8353396574981 ;
:geospatial_lon_min = 9.74259474964444 ;
:geospatial_lon_max = 9.98461064390458 ;
:date_created = "2024-08-08T23:22:05" ;
:history = "File created at 2024-08-08T23:22:05 using Matlab by Jan Magnusson" ;
}
(r-reticulate)
I may have had a similar problem at one point. rast doesn't recognise the time dimension very well, but rather is treated as longitude or latitude. At the time, my question on stack overflow was deleted by the forum administrator before it could be answered ....
One way I can think of is to extract the values, rotate the dimension, and then write the attributes in? (although I didn't try that subsequently)
Thanks, my question is about if rast is flexible and can open netcdf file in a non-standard format . In this case it takes "time"coordinate as latitude beacause "time" dimension is in the position expected for "latitude" dimension and so on. Otherwise variables can be directly extracted with "ncdf4" as array. Thanks @ecor
I'm not sure I understood your point and question correctly, but I'll leave what I was thinking at the time. A 3D array of m*n*t (time) , I had problems recognising it as (m*t(time)*n),at the time I sought a way to expect rast to be able to manually define a certain dimension to be ‘time’ itself, but the question was removed.
Although it is offensive to mention other packs under terra, please forgive me. I ended up using xarray (a python package) for my processing.
The ncdump listing clearly shows the nine data variables in the file and the three axes that they all use. So far, so good. The three axes (a.k.a. coordinate variable, composed of a dimension and a variable) are incompletely specified, however. The CF Metadata Conventions, which this file claims to adhere to, gives three options to identify how the coordinate variables are oriented in space, for X-Y: (1) the use of default units of "degrees_east" for longitude and "degrees_north" for latitude, which does not apply here because the data is in a projected CRS; (2) the use of a "standard_name" attribute, in this case with values "projected_x_coordinate" and "projected_y_coordinate"; or (3) the "axis" attribute with values "X" and "Y". The file provides neither of these mechanisms. The use of the "coordinate system" attribute is not standard and thus not recognised by a netCDF reader that applies the CF Metadata Conventions (and the name of the attribute is illegal too, having a space in it).
GDAL, which the terra package uses for reading netCDF files, understandably has trouble reading this file. It makes a number of guesses which are, unfortunately, not yielding the desired result. I downloaded the data you used (a link would have been useful, as well as notice on the web site that this was for a mere 31GB of data...) and this is what I get with terra::rast():
class : SpatRaster
dimensions : 180, 24, 1665 (nrow, ncol, nlyr)
resolution : 0.04166667, 100 (x, y)
extent : -0.02083333, 0.9791667, 172000, 190000 (xmin, xmax, ymin, ymax)
coord. ref. :
sources : METEO_DATA_202003120000.NC:lwrc (185 layers)
METEO_DATA_202003120000.NC:sdri (185 layers)
METEO_DATA_202003120000.NC:sdfd (185 layers)
... and 6 more sources
varnames : lwrc (Incoming longwave radiation)
sdri (Incoming direct shortwave radiation on inclined surface)
sdfd (Incoming diffuse shortwave radiation)
...
names : lwrc_~76050, lwrc_~76150, lwrc_~76250, lwrc_~76350, lwrc_~76450, lwrc_~76550, ...
unit : W m-2, W m-2, W m-2, W m-2, W m-2, W m-2, ...
depth : 776050 to 794450 (m [m]: 185 steps)
The number of rows (northing) is correct but the columns are in fact the "time" axis and the number of layers is a merger of the easting axis and the 9 data variables in the file - the layer names show the result. The extent is also wrong and there is no CRS.
You can read the file more "cleanly" with package ncdfCF:
library(ncdfCF)
(ds <- open_ncdf("~/METEO_DATA_202003120000.NC"))
#> <Dataset> METEO_DATA_202003120000.NC
#> Resource : ~/METEO_DATA_202003120000.NC
#> Format : classic4
#> Collection : Generic netCDF data
#> Conventions: (not indicated)
#> Keep open : FALSE
#>
#> Variables:
#> name long_name units data_type axes
#> lwrc Incoming longwave radiation W m-2 NC_DOUBLE time, northing, easting
#> sdri Incoming direct shortwave radiation on inclined... W m-2 NC_DOUBLE time, northing, easting
#> sdfd Incoming diffuse shortwave radiation W m-2 NC_DOUBLE time, northing, easting
#> tais Air temperature at 10 m above ground K NC_DOUBLE time, northing, easting
#> rhus Relative humidity at 10 m above ground % NC_DOUBLE time, northing, easting
#> pail Air pressure Pa NC_DOUBLE time, northing, easting
#> wnsd Wind speed at 10 m above ground m/s NC_DOUBLE time, northing, easting
#> wndd Wind direction at 10 m above ground deg NC_DOUBLE time, northing, easting
#> proi Precipitation mm/hr NC_DOUBLE time, northing, easting
#>
#> Attributes:
#> name type length value
#> title NC_CHAR 108 High-resolution hydrometeorological data for th...
#> summary NC_CHAR 244 This dataset contains high-resolution meteorolo...
#> keywords NC_CHAR 220 downwelling_longwave_flux_in_air; direct_downwe...
#> conventions NC_CHAR 17 ACDD 1-3; CF-1.11
#> creator_name NC_CHAR 27 Jan Magnusson; Tobias Jonas
#> creator_email NC_CHAR 34 [email protected]; [email protected]
#> creator_url NC_CHAR 76 https://orcid.org/0000-0003-0257-1862; https://...
#> institution NC_CHAR 49 WSL Institute for Snow and Avalanche Research SLF
#> project NC_CHAR 61 International Network for Alpine Research Catch...
#> geospatial_lat_min NC_DOUBLE 1 46.6734987688503
#> geospatial_lat_max NC_DOUBLE 1 46.8353396574981
#> geospatial_lon_min NC_DOUBLE 1 9.74259474964444
#> geospatial_lon_max NC_DOUBLE 1 9.98461064390458
#> date_created NC_CHAR 19 2024-08-08T23:22:05
#> history NC_CHAR 65 File created at 2024-08-08T23:22:05 using Matla...
# Open a data variable
(lwrc <- ds[["lwrc"]])
#> <Variable> lwrc
#> Long name: Incoming longwave radiation
#>
#> Axes:
#> axis name long_name length values unit
#> T time 24 [2020-03-12T00:00:00 ... 2020-03-12T23:00:00] days since 2020-03-12T00:00:00+01:00
#> northing Northing 180 [189950 ... 172050] m
#> easting Easting 185 [776050 ... 794450] m
#>
#> Attributes:
#> name type length value
#> _FillValue NC_DOUBLE 1 -9999
#> standard_name NC_CHAR 32 downwelling_longwave_flux_in_air
#> long_name NC_CHAR 27 Incoming longwave radiation
#> coverage_content_type NC_CHAR 11 modelResult
#> units NC_CHAR 5 W m-2
This all looks ok but the northing and easting axes are not recognised as Y and X. You need to remedy that manually:
lwrc$axes[["northing"]]$orientation <- "Y"
lwrc$axes[["easting"]]$orientation <- "X"
lwrc
#> <Variable> lwrc
#> Long name: Incoming longwave radiation
#>
#> Axes:
#> axis name long_name length values unit
#> T time 24 [2020-03-12T00:00:00 ... 2020-03-12T23:00:00] days since 2020-03-12T00:00:00+01:00
#> Y northing Northing 180 [189950 ... 172050] m
#> X easting Easting 185 [776050 ... 794450] m
#>
#> Attributes:
#> name type length value
#> _FillValue NC_DOUBLE 1 -9999
#> standard_name NC_CHAR 32 downwelling_longwave_flux_in_air
#> long_name NC_CHAR 27 Incoming longwave radiation
#> coverage_content_type NC_CHAR 11 modelResult
#> units NC_CHAR 5 W m-2
(Note that you have to do this only once: all nine data variables share the same instance of northing and easting and it is those objects that the $orentation applies to.)
Now you have a fully specified coordinate system. You can easily convert this data variable to a terra::SpatRaster:
(r <- lwrc$data()$terra())
#> class : SpatRaster
#> dimensions : 180, 185, 24 (nrow, ncol, nlyr)
#> resolution : 100, 100 (x, y)
#> extent : 776000, 794500, 172000, 190000 (xmin, xmax, ymin, ymax)
#> coord. ref. : lon/lat WGS 84 (EPSG:4326)
#> source(s) : memory
#> names : 2020-~00:00, 2020-~00:00, 2020-~00:00, 2020-~00:00, 2020-~00:00, 2020-~00:00, ...
#> min values : 173.8, 174.1, 174.4, 173.1, 177.0, 180.6, ...
#> max values : 223.3, 222.6, 221.5, 220.1, 221.8, 223.7, ...
All done, except for the CRS. Given the extent of the data variable, it is a bit odd that EPSG:4326 is assigned. You can set your own CRS with r$crs <- "epsg:21781" but it appears that terra does not recognise that even though it is a valid EPSG code.
In conclusion, this is an issue with the data file and not with terra.
in an upcoming GDAL release (3.12, a while until that's available generally) you might get away with
vrt://METEO_DATA_202003120000.NC?transpose=lwrc:2,1
as the input to rast() rather than the file name itself. With earlier GDAL you can use also use cli to do this with
gdalmdimtranslate in.nc out.nc -array "name=lwrc,transpose=[2,1,0]"
I suspect it would work better now if the file had a "grid_mapping" var for the crs. Can you point to this file so I can check this out? (I wonder if GDAL could benefit from a virtualization of 'grid_mapping', so we could inject what it should use via VRT)
(I found, I think, files here: https://www.envidat.ch/#/metadata/inarch-dischma-dataset)
Thanks for the feedback. Yes they comes from Dischma catchment meteorological dataset. At the end working directly within an R session, I opened each using directly ncdf4 directly, modified the attributes and the variables (see lines of code below) and then created a new netCDF file for each original one. The new one can be obened by _terra::rast_ as a raster map. I share the solution here:
#
# DISCHMA METEO DATASET
#
library(ncdf4)
library(terra)
library(stringr)
library(lubridate)
### DISCHMA DATASET: https://www.envidat.ch/#/metadata/inarch-dischma-dataset
meteo_dir <- ".../dischma/meteo/"
output_file_path <- ".../dischma_new/meteo/%s" ## C formatted string with %s
meteo_files <- meteo_dir |> list.files(pattern=".NC",full.names=TRUE)
## try the fist ten nc files:
meteo_files <- meteo_files[1:10] ## you can comment for all files
for (meteo_file in meteo_files) {
# Input and output file names
input_file <- meteo_file ##"METEO_DATA_202003120000.NC"
output_file <- input_file |> basename() |> sprintf(fmt=output_file_path) ##"METEO_DATA_202003120000_reordered.nc"
# Open the original NetCDF file
nc_in <- nc_open(input_file)
# Read dimension values
time_vals <- ncvar_get(nc_in, "time")
northing_vals <- ncvar_get(nc_in, "northing")
easting_vals <- ncvar_get(nc_in, "easting")
# Define new dimensions in the desired order
dim_time <- ncdim_def("time", ncatt_get(nc_in, "time", "units")$value, time_vals)
dim_easting <- ncdim_def("easting", ncatt_get(nc_in, "easting", "units")$value, easting_vals)
dim_northing <- ncdim_def("northing", ncatt_get(nc_in, "northing", "units")$value, northing_vals)
# List of variables to reorder
vars_to_reorder <- c("lwrc", "sdri", "sdfd", "tais", "rhus", "pail", "wnsd", "wndd", "proi")
# Define variables with reordered dimensions
var_defs <- lapply(vars_to_reorder, function(varname) {
var <- nc_in$var[[varname]]
ncvar_def(name = var$name,
units = ncatt_get(nc_in, varname, "units")$value,
dim = list(dim_easting, dim_northing,dim_time), ### EC 20250519
missval = var$missval,
prec = "double")
})
var_defs$crs <- ncvar_def(
name = "crs",
units = "",
dim = list(), # NUll dimension = scalare variable
longname = "Coordinate Reference System",
prec = "integer"
)
#ncvar_put(nc_out,crs_var, 1)
## Create new NetCDF file
nc_out <- nc_create(output_file, var_defs,force_v4=FALSE)
# Write dimension variables
ncvar_put(nc_out, "time", time_vals)
ncvar_put(nc_out, "easting", easting_vals)
ncvar_put(nc_out, "northing", northing_vals)
# Copy dimension attributes
for (dim_name in c("time", "easting", "northing")) {
att_names <- names(ncatt_get(nc_in, dim_name))
for (att in att_names) {
ncatt_put(nc_out, dim_name, att, ncatt_get(nc_in, dim_name, att)$value)
## PUT CRS
}
if (dim_name %in% c("northing","easting")) {
attrcrs <- ncatt_get(nc_in, dim_name, att)$value
epsg_code <- sub(".*\\(EPSG:(\\d+)\\).*", "\\1", attrcrs) |> as.integer()
###ncatt_put(nc_out,"crs","code" , epsg_code)
wkt <- sf::st_crs(epsg_code)$wkt
ncatt_put(nc_out,"crs","crs_wkt",wkt)
} else if (dim_name=="time") {
attrunits_time <- ncatt_get(nc_in, dim_name,"units")$value
t0 <- attrunits_time |> str_split("since") |> sapply(FUN=function(x){x[2]}) |>
str_trim() |> str_replace_all(":","") |>
as.POSIXct(format="%Y-%m-%dT%H%M%S%z",tz="UTC")
time_vals2 <- t0+seconds(round(time_vals*24*60*60))
t00 <- as.Date(time_vals2[3]) |> as.POSIXct(tz="UTC")
time_vals3 <- difftime(time_vals2,t00,units="secs") |> as.numeric()
ncvar_put(nc_out, "time", time_vals3)
attrunits_time2 <- "seconds since %s" |> sprintf(as.character(as.Date(t00)))
ncatt_put(nc_out, dim_name,"units",attrunits_time2)
}
}
ncatt_put(nc_out, "easting", "standard_name", "projection_x_coordinate")
ncatt_put(nc_out, "northing", "standard_name", "projection_y_coordinate")
# Write variables and copy all attributes
for (varname in vars_to_reorder) {
data <- ncvar_get(nc_in, varname)
data_reordered <- aperm(data, c(3, 2, 1)) # from (easting 3 , northing 2 , time 1 ) to (time 1 , easting 3 , northing2 )
ncvar_put(nc_out, varname, data_reordered)
# Copy all variable attributes
att_names <- names(ncatt_get(nc_in, varname))
for (att in att_names) {
ncatt_put(nc_out, varname, att, ncatt_get(nc_in, varname, att)$value)
}
ncatt_put(nc_out, varname,"grid_mapping","crs")
}
# Copy global attributes
global_atts <- names(ncatt_get(nc_in, 0))
for (att in global_atts) {
ncatt_put(nc_out, 0, att, ncatt_get(nc_in, 0, att)$value)
}
# Close files
nc_close(nc_in)
nc_close(nc_out)
cat("✅ NetCDF file successfully reordered to (time, easting, northing) and saved as:", output_file, "\n")
}
and then it works:
> output_file
[1] ".../dischma_new/meteo/METEO_DATA_201610100000.NC"
> rast(output_file)
class : SpatRaster
size : 180, 185, 216 (nrow, ncol, nlyr)
resolution : 100, 100 (x, y)
extent : 776000, 794500, 172000, 190000 (xmin, xmax, ymin, ymax)
coord. ref. : CH1903 / LV03 (EPSG:21781)
sources : METEO_DATA_201610100000.NC:lwrc (24 layers)
METEO_DATA_201610100000.NC:sdri (24 layers)
METEO_DATA_201610100000.NC:sdfd (24 layers)
... and 6 more sources
varnames : lwrc (Incoming longwave radiation)
sdri (Incoming direct shortwave radiation on inclined surface)
sdfd (Incoming diffuse shortwave radiation)
...
names : lwrc_1, lwrc_2, lwrc_3, lwrc_4, lwrc_5, lwrc_6, ...
unit : W m-2, W m-2, W m-2, W m-2, W m-2, W m-2, ...
time : 2016-10-09 23:00:00 to 2016-10-10 22:00:00 UTC (24 steps)
This is not a direct and internal issue of terra but I hope it can be useful in case on how to use some non-standard netCDF files. Thank you best @ecor