terra icon indicating copy to clipboard operation
terra copied to clipboard

Loading `sf` affects `terra`'s ability to read netcdf

Open dramanica opened this issue 1 year ago • 16 comments

I have stumbled across an odd interaction betwen sf and terra. If I create a raster with multiple time slices and save it as a netcdf AFTER loading sf:

library(terra)
library(sf)
a <- rast(ncols=5, nrows=5, nl=50)
values(a) <- 1:prod(dim(a))
time(a, tstep="years") <- rep(1985,50)
writeCDF(a,"b.nc",varname = "bio01",overwrite=TRUE,split=TRUE)
# read back the file
b <- rast("b.nc")

warnings are generated for every layer that is written/read:

Warning messages:
  1: In new_CppObject_xp(fields$.module, fields$.pointer, ...) :
  GDAL Message 1: dimension #0 (time) is not a Time or Vertical dimension.
2: In new_CppObject_xp(fields$.module, fields$.pointer, ...) :
  GDAL Message 1: dimension #0 (time) is not a Time or Vertical dimension.

If I repeat the same without loading sf, no warnings are issued. The code works correctly in both instances, the warnings are just cosmetic (times are set correctly despite the warnings). I am on Ubuntu 22.04, and I have tried with the dev version of terra.

dramanica avatar Jun 29 '23 13:06 dramanica

This is a bit of "bandaid" solution, but I can prevent the warnings when reading back the file by helping GDAL find the time axis with:

nc_in <- ncdf4::nc_open("b.nc", write=TRUE)
 ncdf4::ncatt_put(nc_in, varid="time", attname="axis", attval = "T")
 ncdf4::nc_close(nc_in)

I suppose we could add the axis attribute in writeCDF. But it is not really a "proper" solution to the problem.

dramanica avatar Jun 29 '23 15:06 dramanica

These are messages emitted by the GDAL library, a component sf and terra share; it looks like terra suppresses such messages, where sf emits them. Loading sf before terra lets terra get its way, and no warnings appear. I think this is something we should ideally sort out, as teaching users how loading order matters is pretty ugly.

I wonder whether this would also happen on platforms with static linking (Windows, MacOS) where sf and terra each come with their own copy of GDAL, which may not be shared at run-time.

edzer avatar Jul 03 '23 09:07 edzer

Interesting. I do not immediately see what terra does to suppress this (and that sf explicitly turns that back on?). I wondered whether it had to do with the "GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS" configuration option; but neither package sets that with

library(terra)
#terra 1.7.29
getGDALconfig("GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS")
#GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS
#                                    ""
library(sf)
#Linking to GEOS 3.11.2, GDAL 3.7.0, PROJ 9.2.0; sf_use_s2() is TRUE
getGDALconfig("GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS")
#GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS
#                                   ""

And setting the value to "YES" does not suppress the warnings

setGDALconfig("GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS", "YES")

I use :

gdal()
#[1] "3.7.0"
packageVersion("sf")
#[1] ‘1.0.13’

Perhaps it is different with your configuration

rhijmans avatar Jul 04 '23 00:07 rhijmans

This also triggers the warnings, without sf:

library(terra)
gdal(warn = 1)
a <- rast(ncols=5, nrows=5, nl=50)
values(a) <- 1:prod(dim(a))
time(a, tstep="years") <- rep(1985,50)
writeCDF(a,"b.nc",varname = "bio01",overwrite=TRUE,split=TRUE)
# read back the file
b <- rast("b.nc")
warnings()

edzer avatar Jul 04 '23 06:07 edzer

I can confirm that, on my Ubuntu22.04, setting gdal warnings to 1 gives me the same warnings with terra alone. However, it has not impact of the value of "GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS", which stays as "" no matter what (including when I load sf).

dramanica avatar Jul 04 '23 07:07 dramanica

Thanks @edzer, I should have known that!

rhijmans avatar Jul 04 '23 14:07 rhijmans

Why is

gdal(warn = 1)

not the default?

edzer avatar Jul 05 '23 15:07 edzer

I was looking into b.nc created above, and noted it has

// global attributes:
		:Conventions = "CF-1.4" ;
		:created_by = "R packages ncdf4 and terra (version 1.7-37)" ;

but the warning message emitted suggests it wasn't created with ncdf4.

edzer avatar Jul 05 '23 15:07 edzer

I think I did that because at the time my experience was that gdal(warn = 1) emits a lot of confusing messages with limited value. The ones discussed here are a case in point; but I do not know how general this is, really.

If it says that, the file was created by terra, using ncdf4, but, due to my ignorance, terra may not follow the convention to the letter.

rhijmans avatar Jul 05 '23 15:07 rhijmans

According to my reading of the CF standards, I think the current time units are not quite right, as the CF standard requires a date as a reference point even if the time is in years. So, if I read the CF standard correctly, it should be "years since 1970-01-01". However, gdal fails to read this without warnings even though it is correct. The CF standard also has the option to add an explicit axis attribute (with "axis=T" as mentioned earlier in this thread). gdal manages to sort itself out just with the axis, but I think it would make sense to fix the reference date as well to make the file fully cf compatible.

dramanica avatar Jul 05 '23 20:07 dramanica

Yes, this is where GDAL looks for "valid" time unit signatures, and years since is not in it - no idea if this is an omission or on purpose.

edzer avatar Jul 05 '23 20:07 edzer

Apart from "date/time" and "date" terra also supports the following time steps: "year", "yearmonth" (2001-01, 2001-02, etc), and "month" (1 to 12). In netCDF files I have seen e.g. "2001-01-15" for January 2001 and "2001-01-01" for "2001" and then use "days since" for offsets (as Andrea points out). I do not think that is good enough. "January 15" is a single day, whilst "January" is the entire month. Or is there a way within the netcdf standard (whichever version) to capture that distinction? If not, it might be a better choice to be non-compliant.

rhijmans avatar Jul 06 '23 03:07 rhijmans

Years and months are valid units, but tricky, in the CF standards, which is probably why gdal does not consider them. They don't really conform to the use of year and month in terra: "UDUNITS defines a year to be exactly 365.242198781 days (the interval between 2 successive passages of the sun through vernal equinox). It is not a calendar year. UDUNITS defines a month to be exactly year/12, which is not a calendar month. The CF standard follows UDUNITS in the definition of units, but we recommend that year and month should not be used, because of the potential for mistakes and confusion." So, one option is to be non-CF compliant, and simply help gdal by explicitly defining the dimensions with the "axis" attribute.

dramanica avatar Jul 06 '23 07:07 dramanica

Years and months are valid units, but tricky,

I read that in the udunits docs, and agree that "year" is a difficult unit, however specified dates defined as e.g. 1, 2, 3, 4, 5 years from 1970-01-01 are unambiguous because we know which year we are talking about (the periods inbetween have unequal duration).

@rhijmans CF defines bounds to specify what we call support: whether a time refers to a time instance or a period. Cell methods then can be used to specify the aggregation method, e.g. whether the mean or maximum over a period is reported.

Bounds are new variables, and the time instances do not have to be at the start or the middle of a bounds interval. (This explains much of the mess with lon/lat grids not having regular intervals: the bounds variables typically do).

edzer avatar Jul 06 '23 07:07 edzer

Do terra objects btw have a way to register whether, e.g. in case of yearly time steps, the values are associated with a time instance (Jan 1st 00:00) or with the entire year, and if the entire year how (e.g. mean, min or max?) You'd need this before you can write it out to CF. stars dimensions have a logical field point which can be TRUE, FALSE or NA, indicating point support if TRUE (time instance, spatial point) or block support if FALSE (time period, grid cell or polygon support). It doesn't have metadata for the aggregation method. The GDAL (gpkg, filegdb) data model for vector data has a flag "FieldDomain" that can carry this information (called split- and merge policy).

edzer avatar Jul 06 '23 11:07 edzer

I encounter the same warning with at least these three functions (terra::writeCDF, terra::rast, and plot) on a Linux HPC [OS: SUSE Linux Enterprise Server 15 SP4]

# terra::setGDALconfig("GDAL_NETCDF_IGNORE_XY_AXIS_NAME_CHECKS", "NO") # this does not help
require(terra)
require(sf)

Map_3035 <- terra::rast(
  nrows = 404, ncols = 390, nlyrs = 1, xmin = 2630000, xmax = 6530000, 
  ymin = 1380000, ymax = 5420000, crs = "epsg:3035", vals = 1, names = "Map1")

# A warning occurred at all of these functions
terra::writeCDF(Map_3035, filename = "Map_3035.nc", overwrite = TRUE)
terra::rast("Map_3035.nc")
plot(Map_3035)
# the plot function throws the warnings twice
Warning messages:
1: In x@cpp$readStart() :
  GDAL Message 1: dimension #1 (easting) is not a Longitude/X dimension.
2: In x@cpp$readStart() :
  GDAL Message 1: dimension #0 (northing) is not a Latitude/Y dimension.
3: In x@cpp$readStart() :
  GDAL Message 1: dimension #1 (easting) is not a Longitude/X dimension.
4: In x@cpp$readStart() :
  GDAL Message 1: dimension #0 (northing) is not a Latitude/Y dimension.
****

The order of loaded packages indeed affects the appearance of the warnings: sf / terra --- raster / terra --- terra / raster --- terra --> all these order of packages produced no warnings for me terra / sf --> warning

Although this warning can be annoying, is there anything of concern regarding the output? Is it possible to avoid this warning?

See also my question to stackoverflow.

elgabbas avatar Mar 04 '24 15:03 elgabbas