`terra::rast()` does not correctly interpret a NetCDF file compared to `ncdf4`
Hello,
I have encountered an issue with the terra package while trying to read a NetCDF file. The file in question can be downloaded from the following link: https://doi.org/10.3929/ethz-b-000648738.
When using terra::rast() to read the NetCDF file, the function does not correctly interpret the file's structure. Specifically:
1. terra::rast() identifies only 2 variables in the file, whereas the ncdf4::nc_open() function correctly identifies 4 variables.
2. The ncdf4::nc_open() shows that the file includes a time dimension, butterra::rast()does not recognize the time dimension at all.
For your reference, here is the code I used to read the file with both packages:
library(ncdf4)
library(terra)
# Reading with ncdf4
nc_ncdf4 <- nc_open("D:/GRACE-SeDA_v1_2002_2022.nc")
print(nc_ncdf4)
# Reading with terra
nc_terra <- rast("D:/GRACE-SeDA_v1_2002_2022.nc")
print(nc_terra)
and here are the results:
> print(nc_ncdf4)
File D:/GRACE-SeDA_v1_2002_2022.nc (NC_FORMAT_NETCDF4_CLASSIC):
4 variables (excluding dimension variables):
byte mask[latitude,longitude] (Contiguous storage)
units: -
description: The mask showing the valid data points. 1 = valid. 0 = invalid.
float twsa[time,latitude,longitude] (Contiguous storage)
units: mm
description: Total water storage anomalies in the format of equivalent water height.
float twsa_baseline[latitude,longitude] (Contiguous storage)
units: mm
description: The temporal mean of total water storage anomalies from 2004.000 to 2009.999 which has been removed from the original outputs of the deep learning model.
float uncertainty[time,latitude,longitude] (Contiguous storage)
units: mm
description: 1-sigma uncertianties estimated based on deep ensemble and Monte-Carlo simulations.
3 dimensions:
time Size:216
units: Modified Julian Date
latitude Size:360
units: degrees
description: The latitudes of the center of each 0.5-degree cells.
longitude Size:720
units: degrees
description: The lontitudes of the center of each 0.5-degree cells.
6 global attributes:
title: GRACE-SeDA: A global total water storage anomaly product with a spatial resolution of 0.5 degrees from self-supervised data assimilation
institution: Institute of Geodesy and Photogrammetry, ETH Zurich
authors: Junyang Gou and Benedikt Soja
contact: Junyang Gou; [email protected]
license: CC-BY: Creative Commons Attribution 4.0 International
history: 2024.05.16: Create the first version forced by WaterGAPv2.2e-GSWP3-ERA5.
> print(nc_terra)
class : SpatRaster
dimensions : 720, 360, 2 (nrow, ncol, nlyr)
resolution : 0.5, 0.5 (x, y)
extent : -90, 90, -180, 180 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs
sources : GRACE-SeDA_v1_2002_2022.nc:mask
GRACE-SeDA_v1_2002_2022.nc:twsa_baseline
varnames : mask
twsa_baseline
names : mask, twsa_baseline
unit : -, mm
Could you please investigate this issue or provide guidance on how to correctly read this NetCDF file using the terra package?
Thank you for your attention and assistance.
Best regards,
Jintao
Hi Jintao,
From the print of the file contents it is clear that the axes of the data sets are reversed from the recommendations in the CF Metadata Conventions. While the file does not claim to adhere to the CF conventions, most software for reading netCDF files does. That apparently includes GDAL library, which reads the file for terra. It does manage to identify the axes for variables mask and twsa_baseline, probably by their names, but the 3D variables twsa and uncertainty have a time dimension which is apparently complicating things too much.
It does not stop there. The unit of degrees for the longitude and latitude dimensions should not be used, but rather degrees_east and degrees_north, respectively, as per the CF conventions.
The time unit of Modified Julian Date is also not standard and not described in the readme.txt file. The values obviously represent days from some baseline date but beyond that it's a mystery. Successive data points are mostly around 30 days apart, but it ranges from 11 to 370! From the readme.txt file it is clear that these are months but what are "the 216 valid months when official GRACE(-FO) products are avaliable"?
You can load the data into terra by using ncdf4 to read the arrays of the variables, transpose and flip the arrays as required, and then do terra::rast() on each array. For the time dimension I would suggest that you reach out to the authors of the data set to understand how to map the dimensions values to the "216 valid months".