terra icon indicating copy to clipboard operation
terra copied to clipboard

indicate the presence of geolocation arrays (and GCPs and RPCs)

Open mdsumner opened this issue 7 months ago • 5 comments

these can be leveraged by passing through the warp api (before anything else is done) e.g. under "Geolocation:" (limited by GDAL currently needing to wrap into -180,180 but still common and useful)

gdalinfo vrt:///vsicurl/https://gws-access.jasmin.ac.uk/public/polarres/MetUM_PolarRES/Antarctic/daily/hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc?bands=1
Warning 1: Unhandled X/Y axis unit degrees. SRS will ignore axis unit and be likely wrong.
Driver: VRT/Virtual Raster
Files: /vsicurl/https://gws-access.jasmin.ac.uk/public/polarres/MetUM_PolarRES/Antarctic/daily/hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc
Size is 660, 531
Metadata:
...

Geolocation:
  GEOREFERENCING_CONVENTION=PIXEL_CENTER
  LINE_OFFSET=0
  LINE_STEP=1
  PIXEL_OFFSET=0
  PIXEL_STEP=1
  SRS=GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
  X_BAND=1
  X_DATASET=NETCDF:"/vsicurl/https://gws-access.jasmin.ac.uk/public/polarres/MetUM_PolarRES/Antarctic/daily/hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc":longitude
  Y_BAND=1
  Y_DATASET=NETCDF:"/vsicurl/https://gws-access.jasmin.ac.uk/public/polarres/MetUM_PolarRES/Antarctic/daily/hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc":latitude
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,  531.0)
Upper Right (  660.0,    0.0)
Lower Right (  660.0,  531.0)
Center      (  330.0,  265.5)
Band 1 Block=660x531 Type=Float64, ColorInterp=Undefined
  NoData Value=-1073741824
...

It would be great if the presence of Geolocation arrays, GCPs, or RPCs in the dataset were indicated in the print summary, then we can stream through the warp api (as simple as project(rast, <crs>) as a subset of the broader warp heuristics and partial-definition ability in the API. When terra warns about no extent found would be a good time to check this and hint towards warp-resolving.

related to

https://github.com/rspatial/terra/issues/107

mdsumner avatar May 02 '25 22:05 mdsumner

Do you have a file where this is needed? That is, with your example file (after downloading) I get:

library(terra)
f <- "hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc" 
r <- rast(f)
#terra 1.8.52

r
#class       : SpatRaster 
#size        : 531, 660, 365  (nrow, ncol, nlyr)
#resolution  : 0.1, 0.1  (x, y)
#extent      : 144, 210, -28.1, 25  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
#source      : hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc:surface_upward_latent_heat_flux 
#varname     : surface_upward_latent_heat_flux (surface_upward_latent_heat_flux) 
#names       : surfa~lux_1, surfa~lux_2, surfa~lux_3, surfa~lux_4, surfa~lux_5, surfa~lux_6,   ... 
#unit        : W m-2 
#time (days) : 2000-01-02 to 2000-12-31 (365 steps) 

The extent does not come from GDAL, and the CRS is guessed; but this is otherwise correct?

With the experimental multidim interface, we get the extent, but still need to guess the CRS

x <- terra:::multi(f)
#class       : SpatRaster 
#size        : 531, 660, 365  (nrow, ncol, nlyr)
#dimensions  : grid_longitude, grid_latitude, time (660, 531, 365}
#resolution  : 0.1, 0.1  (x, y)
#extent      : 144, 210, -28.1, 25  (xmin, xmax, ymin, ymax)
#coord. ref. : lon/lat WGS 84 (CRS84) (OGC:CRS84) 
#source      : hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc 
#varname     : surface_upward_latent_heat_flux (surface_upward_latent_heat_flux) 
#names       : surfa~lux-1, surfa~lux-2, surfa~lux-3, surfa~lux-4, surfa~lux-5, surfa~lux-6,   ... 
#unit        : W m-2 
#time (days) : 2000-01-02 to 2000-12-31 (365 steps) 

Or do I miss something?

rhijmans avatar May 26 '25 21:05 rhijmans

Huh, I don't expect that. Certainly multi will do a better analysis than the drivers in 2D mode, but this doesn't have a longlat raster in the file. Apart from multi() is this from code you changed in terra? Your first open would have to be via the warper, unless I mistake something?

(I'll get deeper into this again that's just my first thought.)

mdsumner avatar May 26 '25 21:05 mdsumner

Oh, ok - I hadn't looked that far. The 144, 210, -28.1, 25 is entirely incorrect. (plot it, it's a free floating Antarctica in polar aspect with rotated pole).

So, that's a problem for GDAL that I will pursue, it should not be getting a longlat extent like that.

I'll find a better set of examples (geolocation, gcp, and rpc).

mdsumner avatar May 26 '25 22:05 mdsumner

Apart from guessing the CRS, I did not do anything to rast to read this file better. GDAL did not report an extent --- it was computed by rast by opening the file and using logic from the raster package With the multidim GDAL does not do extents, so again it is my probably my misinterpretation.

rhijmans avatar May 26 '25 23:05 rhijmans

The dataset uses a rotated latitude-longitude grid, based on the general oblique transformation. Most netCDF files that use this CRS also include auxiliary coordinate variables for latitude and longitude. With package ncdfCF I see the following:

library(ncdfCF)

ds <- open_ncdf("~/hfls_ANT-11_ERA5_evaluation_r1i1p1f1_BAS_MetUM_v1-r1_day_20000101_20001231.nc")

(sulhf <- ds[["surface_upward_latent_heat_flux"]])
#> <Variable> surface_upward_latent_heat_flux 
#> 
#> Coordinate reference system:
#>  name                       grid_mapping              
#>  rotated_latitude_longitude rotated_latitude_longitude
#> 
#> Axes:
#>  axis name           length unlim values                                        unit
#>  X    grid_longitude 660          [144.05 ... 209.95]                           degrees
#>  Y    grid_latitude  531          [-28.05 ... 24.95]                            degrees
#>  T    time           365    U     [2000-01-02T12:00:00 ... 2000-12-31T12:00:00] days since 2000-1-1
#> 
#> Auxiliary longitude-latitude grid:
#>  axis name      extent                unit         
#>  X    longitude [-339.819 ... 20.000] degrees_east 
#>  Y    latitude  [-90.000 ... -43.915] degrees_north
#> 
#> Attributes:
#>  name          type      length value                          
#>  standard_name NC_CHAR   31     surface_upward_latent_heat_flux
#>  long_name     NC_CHAR   31     surface_upward_latent_heat_flux
#>  units         NC_CHAR    5     W m-2                          
#>  grid_mapping  NC_CHAR   26     rotated_latitude_longitude     
#>  coordinates   NC_CHAR   18     latitude longitude             
#>  _FillValue    NC_DOUBLE  1     -1073741824                    
#>  missing_value NC_DOUBLE  1     -1073741824                    
#>  cell_methods  NC_CHAR   10     time: mean                     
#>  runid         NC_CHAR    5     aaaaa                          
#>  lbproc        NC_CHAR    3     128                            
#>  lbtim         NC_CHAR    3     121                            
#>  submodel      NC_CHAR    1     1

# The grid mapping
sulhf$crs
#> <Grid mapping> rotated_latitude_longitude 
#> Grid mapping name: rotated_latitude_longitude 
#> 
#> Attributes:
#>  name                      type      length value                     
#>  grid_mapping_name         NC_CHAR   26     rotated_latitude_longitude
#>  grid_north_pole_latitude  NC_DOUBLE  1     5                         
#>  grid_north_pole_longitude NC_DOUBLE  1     20

# Rectify to lat-lon
(sul_geo <- sulhf$subset(longitude = NA, latitude = NA))
#> <Data array> surface_upward_latent_heat_flux 
#> 
#> Values: [-140.3333 ... 302.5833] W m-2
#>     NA: 82618480 (30.8%)
#> 
#> Axes:
#>  axis name      length values                                        unit
#>  X    longitude 1596   [-339.706038 ... 19.811747]                   degrees
#>  Y    latitude   460   [-89.95 ... -44.05]                           degrees
#>  T    time       365   [2000-01-02T12:00:00 ... 2000-12-31T12:00:00] days since 2000-1-1
#> 
#> Attributes:
#>  name          type      length value                          
#>  standard_name NC_CHAR   31     surface_upward_latent_heat_flux
#>  long_name     NC_CHAR   31     surface_upward_latent_heat_flux
#>  units         NC_CHAR    5     W m-2                          
#>  _FillValue    NC_DOUBLE  1     -1073741824                    
#>  missing_value NC_DOUBLE  1     -1073741824                    
#>  cell_methods  NC_CHAR   10     time: mean                     
#>  runid         NC_CHAR    5     aaaaa                          
#>  lbproc        NC_CHAR    3     128                            
#>  lbtim         NC_CHAR    3     121                            
#>  submodel      NC_CHAR    1     1                              
#>  actual_range  NC_DOUBLE  2     -140.333333, 302.583333

Note the peculiar longitude range (for this reason ncdfCF can't produce this out-of-the-box - I had to comment out some checks and rebuild to let this pass).

pvanlaake avatar Aug 13 '25 19:08 pvanlaake