terra icon indicating copy to clipboard operation
terra copied to clipboard

spatSample() slowness

Open mdsumner opened this issue 2 years ago • 4 comments

spatSample() is very slow for this moderate sized ~COG~ GeoTIFF (there's no overviews)

## 2268, 2193, 3  (nrow, ncol, nlyr)
dsn <- "/vsicurl/https://github.com/mdsumner/cog-example/raw/main/cog/sentinel-image.tif"

library(terra)
r <- rast(dsn)

this spatSample call takes forever, and in particular affects the performance of plotRGB(r) :

spatSample(r, 500000, method = "regular", as.raster = TRUE, warn = F)

in fact it's better to do plotRGB(r, maxcell = Inf) but that's obviously a bad idea for very much larger sources.

I can't see why since it's a standard call to RasterIO but I haven't explored more. Using RasterIO outside of terra works quickly as expected.

These are also fast, read the entire thing into memory, or sample to very much smaller:

r.mem <- r * 1
spatSample(r, c(20, 20), method = "regular", as.raster = TRUE, warn = F)

mdsumner avatar Nov 23 '23 21:11 mdsumner

We can also download the entire file (by removing the vsicurl) without loading it into memory and it works ok:

system.time({
  dsn <- "https://github.com/mdsumner/cog-example/raw/main/cog/sentinel-image.tif"
  r <- rast(dsn)
  spatSample(r, 500000)
})
#> user  system elapsed 
#> 0.21    0.08    4.08

kadyb avatar Nov 23 '23 22:11 kadyb

I think it's the OVERVIEW_LEVEL=NONE setting, and appears to be a GDAL problem. The slowness is entirely in RasterIO(), I will follow up with GDAL itself, I could PR to avoid setting that open option if no overviews exist, or perhaps just remove it.

I can reproduce the slowness problem with sf like this:

d2 <- c(219, 226)
dsn <- "/vsicurl/https://github.com/mdsumner/cog-example/raw/main/cog/sentinel-image.tif"

# ## uses RasterIO via sf (equivalent to the spatSample() call above if we include this open option
sf::gdal_utils("translate", dsn, tf <- tempfile(tmpdir = "/vsimem", fileext = ".tif"), options = c("-outsize", d2[1], d2[2], "-ot", "Float64",  "-oo", "OVERVIEW_LEVEL=NONE"))
##a2 <- rast(stars::read_stars(tf))

mdsumner avatar Nov 24 '23 23:11 mdsumner

also, it's not a COG - there's no overviews, an oversight on my part

mdsumner avatar Nov 25 '23 00:11 mdsumner

lovely and fast now as expected since 42ce57ea1edea81f6c4d2e0eb80aa9af3a2beecf

dsn <- "/vsicurl/https://github.com/mdsumner/cog-example/raw/main/cog/sentinel-image.tif"
library(terra)
r <- rast(dsn)
plotRGB(r)

(will be backported to GDAL 3.8.0 by the looks).

there could be some changes in terra, but ultimately I think these RasterIO calls should be replaced by GDALWarp() which will target the right overview, and also work for very large rasters - perhaps terra doesn't want to allow that, or perhaps maxcell should include some way to allow choice of overviews ... so will leave this open for discussion.

mdsumner avatar Nov 25 '23 07:11 mdsumner