terra icon indicating copy to clipboard operation
terra copied to clipboard

terra::mosaic performs less efficiently than raster::mosaic with a list of rasters

Open Jean-Romain opened this issue 2 years ago • 3 comments

When running mosaic on a relatively large list of in-memory rasters terra is much slower than raster and uses a lot of memory (the example below goes beyond my RAM capacity with terra). It works smoothly with raster. Two minimal reproducible examples with raster and terra below.

raster

# Generate a set of 11 * 11 tiles  rasters
x = seq(0,1000, 100)
y = seq(0,1000, 100)
res = vector("list", 11*11)
k = 1
for(i in seq_along(x))
{
  for (j in seq_along(y))
  {
    res[[k]] = raster::extent(x[i], x[i]+100, y[j], y[j] + 100)
    k = k+1
  }
}

rasters = lapply(res, raster::raster, res = 0.5)
rasters = lapply(rasters, raster::init, fun = runif)

# mosaic the list
system.time({
final <- do.call(raster::mosaic, c(rasters, fun = mean))
})
#> 3.6 seconds

terra

# Generate a set of 11 * 11 tiled rasters
x = seq(0,1000, 100)
y = seq(0,1000, 100)
res = vector("list", 11*11)
k = 1
for(i in seq_along(x))
{
  for (j in seq_along(y))
  {
    res[[k]] = terra::ext(x[i], x[i]+100, y[j], y[j] + 100)
    k = k+1
  }
}

rasters = lapply(res, terra::rast, res = 0.5)
rasters = lapply(rasters, terra::init, fun = runif)

# mosaic the list
system.time({
  final <- do.call(terra::mosaic, rasters)
})
#> 21 seconds

I also tested with terra::sprc but it performs like do.call maybe worst

rasters = terra::sprc(rasters)

system.time({
terra::mosaic(rasters)
})

Jean-Romain avatar Mar 21 '22 16:03 Jean-Romain

What is slow in the previous example becomes a blocking issue when upscaled.

The merging/mosaicking process now causes a bad allocation error which does not allow to get to an output:

> terra::merge(sprc(raster.list), filename=file.path(out.folder,"raster",paste0("Intensity_",gridResFine,"m_meanAllReturns.tif")),
+              wopt=list(gdal=c("COMPRESS=LZW", "TFW=YES","of=GTiff")), overwrite=TRUE)
Error in x@ptr$merge(opt) : std::bad_alloc


> alternative2 = sprc(raster.list)
> terra::merge(alternative2, filename=file.path(out.folder,"raster",paste0("Intensity_",gridResFine,"m_meanAllReturns.tif")),
+              wopt=list(gdal=c("COMPRESS=LZW", "TFW=YES","of=GTiff")), overwrite=TRUE)
Error in x@ptr$merge(opt) : std::bad_alloc
In addition: Warning message:
Unable to open /projectname/output/raster/Intensity_1m_meanAllReturns.tif to obtain file list. (GDAL error 4) 

In my case, raster.list refers to a list of 106 rasters with 1m resolution derived from tiles of 500x500m: it was working smoothly on lidR 3 running on W10 with 16Gb of RAM, now gets stuck. I upgraded to 32Gb but it does not change.

I adapted the example proposed by @Jean-Romain to a similar case and it's possible to replicate the error:

raster

# Generate a set of 11 * 11 tiles  rasters
x = seq(0,5000, 500)
y = seq(0,5000, 500)
res = vector("list", 11*11)
k = 1
for(i in seq_along(x))
{
  for (j in seq_along(y))
  {
    res[[k]] = raster::extent(x[i], x[i]+500, y[j], y[j] + 500)
    k = k+1
  }
}

rasters = lapply(res, raster::raster, res = 1)
rasters = lapply(rasters, raster::init, fun = runif)

# mosaic the list
system.time({
  final <- do.call(raster::mosaic, c(rasters, fun = mean))
})

#> 10.5 seconds

terra

x = seq(0,5000, 500)
y = seq(0,5000, 500)
res = vector("list", 11*11)
k = 1
for(i in seq_along(x))
{
  for (j in seq_along(y))
  {
    res[[k]] = terra::ext(x[i], x[i]+500, y[j], y[j] + 500)
    k = k+1
  }
}

rasters = lapply(res, terra::rast, res = 1)
rasters = lapply(rasters, terra::init, fun = runif)

# mosaic the list
system.time({
  final <- do.call(terra::mosaic, rasters)
})

spono avatar Aug 22 '22 13:08 spono

@spono, did you try to use virtual raster and then save as a single file? For this, tiles must be stored on disk, not in memory.

kadyb avatar Aug 22 '22 13:08 kadyb

thank you @kadyb, yes: I took that approach as a temporary workaround but, considering the comparison with raster 's capabilities, looks like a regression to me. Should such approach be considered as a new "standard"?

spono avatar Aug 22 '22 13:08 spono

Sorry that this one took so long. I think that the terra::merge and terra::mosaic methods now are about 20 to 40 times faster than the raster variants. It is also faster (and simpler) with a SpatRasterCollection than with do.call.

rhijmans avatar Oct 17 '22 05:10 rhijmans