terra
Memory problems processing folders of files from list
I'm trying to process a few folders of spatial data but am running into memory issues (filling RAM very quickly, often causing a crash). I've tried to replicate this behaviour in a few examples.
Example 1 - merge folder of rasters:
library(terra)
# make example raster
x = 1000
y <- rast(nrow=x, ncol=x, res=1, vals = sample(0:1, x, replace = TRUE))
writeRaster(y, "y.tif", overwrite = TRUE)
# make list (to emulate list of folder contents)
r.list <- rep("y.tif", 4000)
# read in
q <- lapply(r.list, rast)
# merge
q <- do.call(merge, q) # uses a lot of memory (much more with my real data)
Example 2 - merge folder of shapefiles:
# load vector
s <- system.file("ex/lux.shp", package="terra")
# make list
s.list <- rep(s, 4000)
# make empty SpatVector
v <- vect()
# loop
for(w in 1:length(s.list)){
# add to SpatVector
v <- rbind(v, vect(s.list[[w]]))
} # fills memory
Example 3 - polygonise and merge folder of rasters:
# make empty SpatVector
g <- vect()
# loop
for(d in 1:length(r.list)){
# load
f <- rast(r.list[[d]])
# polygonise
f <- as.polygons(f)
# add to SpatVector
g <- rbind(g, f)
} # fills memory
Please report one case per issue (although 2 and 3 are probably the same), otherwise it becomes more difficult to deal with them. For now:
1) two alternative approaches
# make example raster
x = 100
y <- rast(nrow=x, ncol=x, res=1, vals = 1)
writeRaster(y, "y.tif", overwrite = TRUE)
r.list <- rep("y.tif", 4)
# A
q <- lapply(r.list, rast)
q <- sprc(q)
m <- merge(q)
# or B
v <- vrt(r.list)
m <- writeRaster(v, "out.tif")
2 & 3) Instead, make a list of SpatVector objects and call vect() with that.
s <- system.file("ex/lux.shp", package="terra")
s.list <- rep(s, 4)
v <- lapply(s.list, vect)
v <- vect(v)
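For Example 3, the same list-then-vect() pattern might look like the sketch below. This is only an illustration under assumptions: the small constant raster written to tempdir() and the helper argument name path are made up for the example, and real data may need extra processing before polygonising.

```r
library(terra)

# Small example raster written to a temporary file (illustrative only)
y <- rast(nrows = 100, ncols = 100, vals = 1)
tif <- file.path(tempdir(), "y.tif")
writeRaster(y, tif, overwrite = TRUE)

# Emulate a folder listing
r.list <- rep(tif, 4)

# Polygonise each raster and keep the results in a list
p <- lapply(r.list, function(path) as.polygons(rast(path)))

# Combine once at the end instead of calling rbind() inside a loop
g <- vect(p)
```

Building the list first and combining once avoids repeatedly copying a growing SpatVector, which is presumably what makes the rbind() loop so expensive.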
Sorry for triple posting; I assumed the issues in all three examples were linked. I had found a workaround for all three but thought I'd post them here as it seemed like unexpected behaviour. In example 1 I was trying to process a 5 MB folder of files and it was using 2 GB of RAM; example 2 was a 45 MB folder of shapefiles, using 12 GB of RAM. That made me think something might be wrong, hence posting here.
Thanks very much for the suggestions. Some feedback:
# Example 1
# A
q <- lapply(r.list, rast)
q <- sprc(q)
m <- merge(q) ## this fills memory and crashes
# or B
v <- vrt(r.list) ## this works beautifully
m <- writeRaster(v, "out.tif")
The Example 2 solution works great as well. I'm having computer issues at the moment, but I think a combination of vrt() and the Example 2 solution should work for Example 3. Thanks very much.
OK, I can't get this to work for Example 3. The objective here is to load a folder of rasters, process them, convert them to polygons and save the result as one shapefile. I've tried using vrt() to load the rasters as you suggested, which works great, but I can't then process them:
r <- vrt(r.list)
r <- classify(r, cbind(-Inf, 0.5, NA), right=FALSE)
I get Error: [classify] insufficient disk space (perhaps from temporary files?). The folder of rasters is approx. 400 MB in total. I would have thought the loop approach I originally gave would be a memory-safe way of doing this (and it works fine if I use sf to save the shapefile, i.e. making an empty sf object at the start and changing the last line to g <- rbind(g, st_as_sf(f))). Why does it crash with writeVector? Is there a terra-only way of doing this?
As you are not providing a filename, the lack of disk space would be for your tempdir().
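A minimal sketch of two ways to steer where classify() writes, assuming the goal is to avoid the default temporary location. The output file name and the terra_tmp folder are hypothetical; in practice both would point at a disk with enough free space rather than at tempdir().

```r
library(terra)

# Small in-memory raster standing in for the real data (illustrative only)
y <- rast(nrows = 100, ncols = 100, vals = 1)

# Option 1: pass a filename so the result is written where you choose
out <- file.path(tempdir(), "classified.tif")  # hypothetical output path
r <- classify(y, cbind(-Inf, 0.5, NA), right = FALSE,
              filename = out, overwrite = TRUE)

# Option 2: point terra's temporary files at another existing folder
td <- file.path(tempdir(), "terra_tmp")        # hypothetical folder
dir.create(td, showWarnings = FALSE)
terraOptions(tempdir = td)
```

Whether either option resolves the error depends on how much free space the chosen location actually has.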