terra
terra copied to clipboard
Conduct focal over part of a SpatRaster [FEATURE REQUEST]
I'm working on parallelizing focal operations by breaking large rasters into smaller chunk/tiles by row and then sending out each chunk to a different core (ailich/GLCMTextures#25). However, since focal operations require the context of surrounding cells, the first/last row of a chunk that I want to calculate values for requires information from the rows above/below it (I'll call these buffer rows). Conducting focal
on each of these chunks, trimming them accordingly and merging them back together works; however, since focal
goes through all the cells in a raster, the focal calculations are conducted on these buffer rows that themselves do not have the needed context for focal calculations. These are later overwritten by values calculated with the full focal context when merging chunks since there is overlap between chunks. Since these values will be overwritten, there is no need to calculate them and with larger window sizes, more chunks/cores, and more columns, this added computational burden increases. I noticed focalValues
you can specify where to start and and end extracting values via row
and nrow
. I was wondering if something similar could be implemented in focalCpp
and focal
where focal operations would only be conducted for a portion of the raster and things beyond the specified range would just be filled with NA
or a specified fill value.
One way to achieve that could perhaps be by setting windows. I have added a new function getTileExtents
that gives you a set of windows (extents). You can use argument "buffer" to add additional rows and columns. Your output would then be tiles that can be combined with vrt
. I am trying to do something generic, rather then something specific for focal
/focalCpp
, if possible.
Thanks @rhijmans, that definitely makes creating the tiles easier, though a bit more description what y
does based on the input type would be useful, and potentially being able to set the desired row and columns as a vector of length two rather than a single number. I'm not sure I'm fully understanding but I believe with this method the areas of overlap will still be processed twice which adds computation time. Additionally, the tiles will have to be trimmed back before merging or else that results in striping artifacts. If buffer cells are not evaluated in focal
and are filled with NA
merge can be used without trimming since it will always prioritize non-NA values.
library(terra)
#> Warning: package 'terra' was built under R version 4.3.2
#> terra 1.7.70
r <- rast(ncols=100, nrows=100)
values(r) <- 1:ncell(r)
x <- rast(ncols=10, nrows=10)
w<- c(3,3) # window size
filename <- paste0(tempfile(), "_.tif")
ff <- makeTiles(r, x, buffer= (w[1]-1)/2, filename)
fname<- vector(mode = "character", length= length(ff))
for (i in 1:length(ff)) {
fname[i]<- tempfile(fileext = ".tif")
focal(rast(ff[i]), filename=fname[i], fun=mean, na.rm=TRUE)
}
z_ref<- focal(r, fun=mean, na.rm=TRUE) #Reference of correct surface
z_vrt<- vrt(fname) #Combine with vrt
z_merge<- do.call(merge, lapply(fname, rast)) #Combine with merge
plot(z_ref-z_vrt, main = "Reference - VRT")
plot(z_ref-z_merge, main = "Reference - Merge")
plot(z_merge-z_vrt, main="Merge - VRT")
Created on 2024-01-30 with reprex v2.0.2
a bit more description what y does based on the input type would be useful
I have added a little bit
being able to set the desired row and columns as a vector of length two rather than a single number
That was already possible for both y
and buffer
I believe with this method the areas of overlap will still be processed twice which adds computation time.
That is correct
Additionally, the tiles will have to be trimmed back before merging or else that results in striping artifacts. If buffer cells are not evaluated in focal and are filled with NA merge can be used without trimming since it will always prioritize non-NA values.
Yes, but it would easy to use crop on the tile extents created the same way, but with buffer=0
It would also be possible to add a "cores" argument to focal
(like in e.g. app
).
I think that would also be possible for focalCpp
. You would have to provide the uncompiled version of the cpp function so that it can be compiled on each node.
Thanks @rhijmans. Using crop
and buffer=0
seems to mostly work. Getting some issues on the left and right side of the raster.
library(terra)
#> Warning: package 'terra' was built under R version 4.3.2
#> terra 1.7.71
r <- rast(ncols=100, nrows=100)
values(r)<- 1:ncell(r)
x <- rast(ncols=10, nrows=10)
w<- c(3,3) # window size
ff <- makeTiles(r, x, buffer= (w-1)/2, filename= paste0(tempfile(), "_.tif"))
tile_exts_names<- makeTiles(r, x, buffer= 0, filename= paste0(tempfile(), "_.tif"))
tile_exts<- lapply(lapply(tile_exts_names, rast), ext)
invisible(file.remove(tile_exts_names)) #Clear up disk space because only need extents of these rasters
fname<- vector(mode = "character", length= length(ff))
for (i in 1:length(ff)) {
fname[i]<- tempfile(fileext = ".tif")
crop(focal(rast(ff[i]), fun=mean, na.rm=TRUE), y = tile_exts[[i]], filename=fname[i])
}
z_ref<- focal(r, fun=mean, na.rm=TRUE) #Reference of correct surface
z_vrt<- vrt(fname) #Combine with vrt
z_merge<- do.call(merge, lapply(fname, rast)) #Combine with merge
plot(z_ref!=z_vrt, main = "Reference vs VRT")
plot(z_ref!=z_merge, main = "Reference vs Merge")
Created on 2024-01-31 with reprex v2.0.2
@rhijmans, having cores built into focal
and focalCpp
would be excellent if possible! I'm not sure about the internals of those functions but I suspect they call focalValues
. If so, if tiled by row then using focalValues(tile, row=buffer+1, nrow=nrow(tile)-2*buffer)
could be used to avoid performing focal operations twice in the buffer areas. This would apply for all interior tiles. The top would need to start at the top using row=1
and the end tile would need to end at the bottom using nrow= nrow(tile)-buffer
.
library(terra)
#> Warning: package 'terra' was built under R version 4.3.2
#> terra 1.7.71
r <- rast(ncols=99, nrows=99)
values(r)<- rep(1:nrow(r), each=ncol(r)) #Values correspond to row numbers
w<-c(3,3)
buffer<- (w[1]-1)/2
ff <- makeTiles(r, c(33,ncol(r)), buffer= (w[1]-1)/2, filename= paste0(tempfile(), "_.tif"))
length(ff) # 3 tiles
#> [1] 3
#With no buffer, each tile should be 33 rows
tile2<- rast(ff[2]) #Middle tile
fv<- focalValues(tile2, row=buffer+1, nrow= nrow(tile2)-(2*buffer)) # Start at row after buffer on top, and number of rows based on total in tile minus top and bottom buffer
cv<- fv[,ceiling(prod(w)/2)] #Center value of each window
min(cv) # Start from row 34 b/c tile 1 goes
#> [1] 34
max(cv) # End at row 66 b/c tile 3 starts at 67
#> [1] 66
Created on 2024-01-31 with reprex v2.0.2
This would require a bit of an overhaul to focal
/focalCpp
but the issue of overlap could be eliminated by focal
being redefined such that it first creates a new SpatRaster
object where each layer is an element in the focal window and then it is just a call to app
. Then since each cell has all the information it needs, you could simply pass the values and function to app
and tiling could be done without any buffer. Not sure how this would affect performance though.
library(terra)
#> Warning: package 'terra' was built under R version 4.3.2
#> terra 1.7.71
r <- rast(ncols=100, nrows=100)
values(r)<- 1:ncell(r) #Values correspond to row numbers
w<-c(3,3)
focal_rast<- rast(r) #Initialize raster
nlyr(focal_rast)<- prod(w) # Make number of layers equal to number of elements in focal window
fv<- focalValues(r) # Get focal values (not memory safe)
values(focal_rast)<- fv # Each cell contains values from focal window corresponding to that cell in original raster
fv[1,]
#> [1] NA NA NA 100 1 2 200 101 102
focal_rast[1,1,] #Values across layers correspnd to first row of fv
#> lyr.1 lyr.2 lyr.3 lyr.4 lyr.5 lyr.6 lyr.7 lyr.8 lyr.9
#> 1 NA NA NA 100 1 2 200 101 102
z_ref<- focal(r, w=w, fun=mean, na.rm=TRUE) #Reference of correct surface
z_app<- app(focal_rast, fun="mean", na.rm=TRUE)
plot(z_ref==z_app)
# Tile focal_rast
x <- rast(ncols=10, nrows=10)
ff <- makeTiles(focal_rast, x, buffer= 0, filename= paste0(tempfile(), "_.tif")) #Tile focal_rast with no buffer because cells include the information needed across layers
fname<- vector(mode = "character", length= length(ff))
for (i in 1:length(ff)) {
fname[i]<- tempfile(fileext = ".tif")
app(rast(ff[i]), fun="mean", na.rm=TRUE, filename=fname[i]) # apply function using app
}
z_app_vrt<- vrt(fname) #Combine with vrt
z_app_merge<- do.call(merge, lapply(fname, rast)) #Combine with merge
plot(z_app_vrt!=z_ref)
plot(z_app_merge!=z_ref)
Created on 2024-01-31 with reprex v2.0.2
A simple version of this wrapped into a function would be
focal2<- function(x,w,fun, ...){
focal_rast<- rast(x) #Initialize raster
nlyr(focal_rast)<- prod(w) # Make number of layers equal to number of elements in focal window
fv<- focalValues(r) # Get focal values (not memory safe)
values(focal_rast)<- fv # Each cell contains values from focal window corresponding to that cell in original raster
out<- app(focal_rast, fun=fun,...)
return(out)
} #Alternative focal function
@rhijmans, perhaps a more general option would be to allow for focal
and focalCpp
to take a SpatRaster
of 1s and 0's indicating which cells in the raster to iterate over. In addition to allowing for calculations to be skipped in areas of overlap while chunking rasters, this would also speed up calculations for non-rectangular data where much of the raster grid can be filled with NA's.
library(terra)
#> Warning: package 'terra' was built under R version 4.3.2
#> terra 1.7.71
r<- rast(volcano,
extent= ext(2667400, 2667400 + ncol(volcano)*10,
6478700, 6478700 + nrow(volcano)*10),
crs = "EPSG:27200")
AOI_shp<- vect(structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2667400.28, 2667770.32,
2667850.36, 2667990.72, 2667401.44, 2667400.28, 6479302.96, 6479569.76,
6479268.16, 6478699.76, 6478891.16, 6479302.96, 0, 0, 0, 0, 0,
0), dim = 6:5, dimnames = list(NULL, c("geom", "part", "x", "y",
"hole"))), type="polygons", crs=crs(r))
r<- mask(r, AOI_shp)
r<- crop(r, AOI_shp)
plot(r)
# 37 percent of cells are NA
round(100*(freq(r, value=NA)$count/ncell(r)))
#> [1] 37
AOI<- rast(r, vals=1)
AOI<- mask(AOI, AOI_shp, updatevalue=0)
plot(AOI) #Maybe focal/focalCpp could use this as an index band to know which cells to calculate over and which ones to skip
Created on 2024-03-19 with reprex v2.0.2