terra
terra copied to clipboard
std::bad_alloc with disaggregate
Hi, disclaimer - I'm not actually an R programmer so maybe I'm missing something simple - probably I'm disaggregating in a wrong way?
I'm using terra to work with hyperspectral rasters of geological material (sediment cores). These are pretty big files with around 400 to almost 500 wavelengths (so number of layers) that can easily reach more than 60 GB. Terra is perfect solution, as operations with that kind of data in memory are rather hard and we are aiming at different hardware setups - much appreciation for your hard work devs!
At one point so called white reference (small white object) has to match exactly extent of captured data (long sediment core). We achieve it by first cropping it to the same width, then aggregating reference file into one mean row, and then disaggregating this mean row it to match captured data extent and cell size.
This works fine with smaller subsets and datasets of approx. 400 lyrs, but with datasets of approx. 500 lyrs it cannot allocate space. With subsets around 50-100 lyrs it works really fast.
I'm not sure how to provide more reproducible example, because failure occurs with very large datasets.
I tried using different tmp directories on SSD drives and making sure there is more than required space available, and increasing number of steps with: terra::terraOptions(steps = 1000, tempdir = "e:/tmp").
These are the custom functions that are used together:
raster_crop <- function(raster, type, roi, ref_type) {
# If cropping entire capture SpatRaster use entire large ROI
if (type == "capture") {
raster <- terra::crop(raster, roi, filename = paste0(paths[["directory"]], "products/capture_cropped.tif"), overwrite = TRUE)
# If cropping reference SpatRaster use only xmin and xmax from large ROI
} else if (ref_type == "whiteref"){
raster <- terra::crop(raster, c(terra::xmin(roi), terra::xmax(roi), terra::ymin(raster), terra::ymax(raster)), filename = paste0(paths[["directory"]], "products/whiteref_cropped.tif"), overwrite = TRUE)
} else {
raster <- terra::crop(raster, c(terra::xmin(roi), terra::xmax(roi), terra::ymin(raster), terra::ymax(raster)), filename = paste0(paths[["directory"]], "products/darkref_cropped.tif"), overwrite = TRUE)
}
# Return raster
return(raster)
}
create_reference_raster <- function(raster, roi, ref_type) {
if (ref_type == "whiteref"){
name <- "whiteref"
} else {
name <- "darkref"
}
# Aggregate data into one row SpatRaster, "divide" by number of rows
raster <- terra::aggregate(raster, fact = c(terra::nrow(raster), 1), fun = "mean", filename = paste0(paths[["directory"]], "products/", name, "_agg.tif"), overwrite = TRUE)
# Set new extent to match extent of capture SpatRaster
terra::ext(raster) <- roi
# Disaggregate data over entire extent to mach capture SpatRaster extent, "multiply" by ymax
raster <- terra::disagg(raster, fact = c(terra::ymax(raster), 1), filename = paste0(paths[["directory"]], "products/", name, "_disagg.tif"), overwrite = TRUE)
# Return raster
return(raster)
}
Chained together
big_roi <- terra::ext(c(340, 1560, 700, 34775))
whiteref <- terra::rast("capture/WHITEREF_PRI_22_PRI_22_AB_2023-01-16_13-44-50.raw")
whiteref <- raster_crop(raster = whiteref, type = "reference", roi = big_roi, ref_type = "whiteref") |>
create_reference_raster(raster = _, roi = big_roi, ref_type = "whiteref")
filename : products/whiteref_cropped.tif
compute stats : 1, GDAL: 0, minmax: 0, approx: 1
driver : GTiff
disk available: 1530.4 GB
disk needed : 0.1 GB
memory avail. : 23.92 GB
memory allow. : 14.35 GB
memory needed : 1.731 GB (4 copies)
in memory : true
block size : 100 rows
n blocks : 100
pb : 3
filename : products/whiteref_agg.tif
compute stats : 1, GDAL: 0, minmax: 0, approx: 1
driver : GTiff
disk available: 1530.3 GB
disk needed : 0 GB
memory avail. : 23.69 GB
memory allow. : 14.22 GB
memory needed : 0.017 GB (4 copies)
in memory : true
block size : 1 rows
n blocks : 1
pb : 900
filename : products/whiteref_disagg.tif
compute stats : 1, GDAL: 0, minmax: 0, approx: 1
driver : GTiff
disk available: 1538 GB
disk needed : 75.2 GB
memory avail. : 23.64 GB
memory allow. : 14.18 GB
memory needed : 1.04646e+07 GB (69550 copies)
in memory : false
block size : 1 rows
n blocks : 34775
pb : 3
This leads to
Error: std::bad_alloc
6. stop(structure(list(message = "std::bad_alloc", call = NULL,
cppstack = NULL), class = c("std::bad_alloc", "C++Error",
"error", "condition")))
5. x@ptr$disaggregate(fact, opt)
4. .local(x, ...)
3. terra::disagg(raster, fact = c(terra::ymax(raster), 1), filename = paste0(paths[["directory"]],
"products/", name, "_disagg.tif"), overwrite = TRUE)
2. terra::disagg(raster, fact = c(terra::ymax(raster), 1), filename = paste0(paths[["directory"]],
"products/", name, "_disagg.tif"), overwrite = TRUE)
1. create_reference_raster(raster = raster_crop(raster = whiteref,
type = "reference", roi = big_roi, ref_type = "whiteref"),
roi = big_roi, ref_type = "whiteref")
I'm working on Windows 11 machine with 32 GB RAM.
> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)
Matrix products: default
locale:
[1] LC_COLLATE=Polish_Poland.utf8 LC_CTYPE=Polish_Poland.utf8 LC_MONETARY=Polish_Poland.utf8 LC_NUMERIC=C LC_TIME=Polish_Poland.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] Rcpp_1.0.10 codetools_0.2-18 pak_0.4.0 fansi_1.0.4 withr_2.5.0 utf8_1.2.2 dplyr_1.1.0 R6_2.5.1 lifecycle_1.0.3 magrittr_2.0.3 pillar_1.8.1 rlang_1.0.6 cli_3.6.0
[14] rstudioapi_0.14 vctrs_0.5.2 generics_0.1.3 tools_4.2.2 glue_1.6.2 purrr_1.0.1 compiler_4.2.2 pkgconfig_2.0.3 terra_1.7-3 tidyselect_1.2.0 tibble_3.1.8
Can you confirm that the below are the relevant steps in your script to get to the error? Here I am using a SpatRaster with no cell values, so the memory problem will not occur.
wref <- rast(ncol=1600, nrow=35000, nlyr=3, xmin=0, xmax=1600, ymin=0, ymax=35000)
roi <- ext(c(340, 1560, 700, 34775))
e <- c(as.vector(roi)[1:2], as.vector(ext(wref)[1:2]))
x <- crop(wref, e)
x <- aggregate(x, fact=c(nrow(x), 1), fun = "mean")
ext(x) <- roi
y <- disagg(x, fact=c(ymax(x), 1))
#class : SpatRaster
#dimensions : 34775, 1220, 3 (nrow, ncol, nlyr)
#resolution : 1, 0.9798706 (x, y)
#extent : 340, 1560, 700, 34775 (xmin, xmax, ymin, ymax)
#coord. ref. :
And can you please show(whiteref) so that it is clearer what you are dealing with?
Hi,
I run your code with my whiteref and as you wrote, these are relevant steps leading to the std::bad_alloc
wref <- terra::rast("...capture/WHITEREF_PRI_22_PRI_22_AB_2023-01-16_13-44-50.raw")
roi <- ext(c(340, 1560, 700, 34775))
e <- c(as.vector(roi)[1:2], as.vector(ext(wref)[1:2]))
x <- crop(wref, e)
x <- aggregate(x, fact=c(nrow(x), 1), fun = "mean")
ext(x) <- roi
y <- disagg(x, fact=c(ymax(x), 1))
Error: std::bad_alloc---------|---------|
Raw white ref looks like this show(whiteref)
class : SpatRaster
dimensions : 100, 2184, 476 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 2184, 0, 100 (xmin, xmax, ymin, ymax)
coord. ref. :
source : WHITEREF_PRI_22_PRI_22_AB_2023-01-16_13-44-50.raw
names : 397.65, 398.84, 400.02, 401.21, 402.39, 403.58, ...