vapour icon indicating copy to clipboard operation
vapour copied to clipboard

converting from vapour to gdalraster

Open mdsumner opened this issue 10 months ago • 2 comments

vapour warp functions return bands in a list with native type

  • unless scaling is present see #226
  • unless band_output_type is set (overriden by scaling logic tho)
  • with attributes, dimension, extent, projection
  • native type is only supported as Byte->raw, (U)Int*->integer, Float*->double (complex/Int64 unclear atm ...)

gdalraster read_ds function return bands in a flat vector (not a list)

  • byte is converted to integer not raw
  • all int types are integer, float->numeric, (what about int64, complex ...)
  • attribute "gis" has properties bbox, dim (the 3D array shape), srs

unsure

what about complex and int64 above?

ximage supports read_ds type, but I'm unclear how that was navigated ... with flat band output matrix(, nrow = dim[2], byrow = TRUE) we trivially get rasterImage/ximage form for the bbox, but with 3D there's no "byrow" for arrow, so either split and bind or array(, dim) and aperm

the 2D flat band vs 3D flat array thing is interesting, and reading dataset-as-a-whole is supported in GDAL as is per band-specified reads

I have some nascent "nara" support in vapour, so we return a nativeRaster from C++ rather than 1, 3, or 4 Byte (or integer) bands, this is super fast for vis with rasterImage, and more compact than integer vectors, and coolbutuseless has helpers)

Rationale ...

@ctoney just keeping notes while I replace vapour workflows with gdalraster, I think we have an opportunity to build a lightweight convention for GDAL output and abstract representation of a grid that is independent of any libs (but I don't yet see the whole picture).

I'm interested in the cell abstraction functions and affine tools that GDAL and related packages have, I have the raster-cell functions separated out in hypertidy/vaster but don't yet have a strong plan to formalize around that (it could be a small C++ lib, a simple header include for other projects).

My example code now has this style with gdalraster:

## worker chunk  of warpfun() that does a single warp task 
 gdalraster::warp(x, tf <- tempfile(fileext = ".tif", tmpdir = "/vsimem"), t_srs = crs, cl_arg = cl_arg, quiet = TRUE)

  ds <- new(gdalraster::GDALRaster, tf)
  dat <- gdalraster::read_ds(ds)
  ds$close()
  gdalraster::vsi_unlink(tf)

## the code that invokes warpfun ($red/green/blue/cloud are Sentinel scenes in an index)
check <- gdalraster::buildVRT(visual <- tempfile(fileext = ".vrt", tmpdir = "/vsimem"),
                                  sprintf("/vsicurl/%s", c(src$red, src$green, src$blue, src$cloud)), cl_arg = "-separate", quiet = TRUE)

 vis <- warpfun(visual, dim = dim, ext = ext, crs = crs)

## ... now we do some pixel  filtering and processing in whatever scheme we like

I was juggling between the attribute types of vapour and gdalraster as described above, but largely this workflow doesn't need them - I want ways to minimize the churn so I'm writing down these thoughts. Input dim and extent don't need to be checked from the warper (what you put in is what you get out, within numeric support), but for example we have have dim = c(0, 1024) or dim = c(1024, 0) for windowed reads that GDAL will ifgure out the second shape from the aspect ratio (of the source data or of the '-te' target extent).

mdsumner avatar Apr 09 '24 22:04 mdsumner

the vapour code was more like this, I got in trouble with #226 and so it prompted me to try gdalraster rather than pursue that issue for now

## worker chunk of warpfun()
file <- vapour::gdal_raster_data(x, bands = 1:3 target_dim = dim,
target_crs = crs, target_ext = ext,
options = c("-wo", "NUM_THREADS=ALL_CPUS", "-multi"), band_output_type = "integer")

# the code that invokes warpfun ($red/green/blue/cloud are Sentinel scenes in an index)
visual <- vapour:::buildvrt(sprintf("/vsicurl/%s", c(src$red, src$green, src$blue)))
scl <- sprintf("/vsicurl/%s", src$cloud)
vis <- warpfun(visual, dim = dim, ext = ext, crs = crs)
sl <- warpfun(scl, bands = 1, dim = dim, ext = ext, crs = crs)

mdsumner avatar Apr 09 '24 22:04 mdsumner

read_ds() does have argument as_list if needed.

R complex is returned if the raster has a complex data type. Note that UInt32 is returned as double:

Data are read as R integer type when possible for the raster data type (Byte, Int8, Int16, UInt16, Int32), otherwise as type double (UInt32, Float32, Float64).

From this I noticed that the documentation currently does not mention int64 types, but a warning is emitted if a raster with int64 data type is opened:

Int64/UInt64 raster data types are not fully supported. Loss of precision will occur for values > 2^53. Int64/UInt64 raster data are currently handled as 'double'.

The gdalvector branch now links to RcppInt64 which provides conversion to/from bit64::integer64. FIDs and vector attribute fields of OFTInteger64 are now returned as S3 type integer64. Similar support for raster int64 should be added eventually.

ctoney avatar Apr 10 '24 02:04 ctoney