aniMotum
aniMotum copied to clipboard
Bathymetry constraint function
Problem Many coastal species have erroneous positions or, more frequently, predicted locations from the model fit are on land.
Solution example.zip Is there a way to implement a constraint function that prevents land from even being an option? I've done this in adehabitatLT when generating simulated tracks like:
# consfun.r and internal data .rda object containing bdf attached as example.zip
# note bdf is passed to consfun as par argument and is a SpatialPixelsDataFrame based on bathymetry, where 1 is possible (water) and NA is not (land)
source('consfun.r')
load('sysdata.rda')
library(adehabitatLT); library(foieGras)
# filter, fit and predict for ellie
data(ellie)
bathy <- raster::raster(bdf) # make bathy raster
ellie$bathy <- raster::extract(bathy, cbind(ellie$lon, ellie$lat))
ellie <- filter(ellie, bathy == 1) # get rid of points where ellie is on land
fit <- fit_ssm(ellie[,1:8], model = "rw", time.step = 24)
fls <- foieGras::grab(fit, 'pred', as_sf=F) # need regular time res for ltraj
# build ltraj object
tr1 <- adehabitatLT::as.ltraj(cbind(fls$lon, fls$lat), date=fls$date, id=as.factor(fls$id),
proj4string = sp::CRS("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"))
crw <- adehabitatLT::NMs.randomCRW(na.omit(tr1), rangles=TRUE, rdist=TRUE,
constraint.func = consfun,
constraint.par = bdf, nrep = 10)
# simulate
tmp <- adehabitatLT::testNM(crw)
# add unique ids for each sim track for plotting
sim_out <- tmp[[1]] %>% do.call(rbind, .)
sim_out$id <- NA
idx <- c(1, which(diff(sim_out$date) < 1) + 1)
for (i in 1:(length(idx) - 1)) sim_out$id[c(idx[i]:(idx[i + 1] - 1))] <- i
# plot
data(countriesHigh, package = "rworldxtra", envir = environment())
wm <- suppressMessages(fortify(countriesHigh))
rm(countriesHigh)
xl <- extendrange(fls$lon, f = 0.1)
yl <- extendrange(sim_out$y, f = 0.1)
p1 <- ggplot() + geom_polygon(data = wm, aes_string(x = "long", y = "lat", group = "group"), fill = grey(0.3)) +
coord_cartesian(xlim = xl, ylim = yl) + xlab("Longitude") + ylab("Latitude")
p1 <- p1 + geom_path(data = sim_out, aes_string(x = "x", y = "y", group = "id"))
p1 <- p1 + geom_point(data = fls, aes_string(x = 'lon', y = 'lat', group = NULL))
p1
Something like this is in the works but so far it's proving to be rather computationally demanding. I'll have a look through your code to see if it spawns some new ideas. Or, if you want to take a stab you're welcome to contribute!
We do this pretty heavily in tripEstimation/SGAT, perhaps we can help speed it up?
Even a large grid like Etopo2 doesn't seem to be too slow, is this fast enough?
## generate rando coordinates upfront
ra <- lapply(1:1000, function(i) geosphere::randomCoordinates(2500))
library(raster)
#> Loading required package: sp
## make a function to encapsulate the grid
## so we can pass this into model funs
mkMask <- function(x) {
## use brick() for annoying technical reasons
grid <- raster::brick(x < 0)
function(pts) {
v <- raster::extract(grid, pts)
## any NA values are outside grid range
v[is.na(v)] <- FALSE
v
}
}
## a very low res etopo of only southern
## (we can get a better mask)
maskfun <- mkMask(quadmesh::etopo)
plot(ra[[300]], col = maskfun(ra[[300]]) + 1, ylim = c(-90, 90))
system.time({
for (i in seq_along(ra)) {
tst <- maskfun(ra[[i]])
}
})
#> user system elapsed
#> 0.887 0.020 0.907
## what if we use a much higher-res grid
maskfun2 <- mkMask(raadtools::readtopo("etopo2"))
plot(ra[[300]], col = maskfun2(ra[[300]]) + 1, ylim = c(-90, 90))
system.time({
for (i in seq_along(ra)) {
tst <- maskfun2(ra[[i]])
}
})
#> user system elapsed
#> 0.976 0.000 0.976
Created on 2019-04-23 by the reprex package (v0.2.1)
I reckon that hard part is actually getting a data set of reasonable resolution for the given model, but we could encode a masked Etopo2 or Etopo1 into this package, it's tiny as a compressed mask (much bigger in memory of course):
topo <- raadtools::readtopo("etopo2") <= 0
## attached here as a .zip, about 0.5 Mb
writeRaster(topo, "etopo2-oceanmask.tif", datatype = "INT1U", options = "COMPRESS=DEFLATE")
(Apologies for use of raadtools, which is not available without preparation, but I'm happy to provide data in any way). etopo2-oceanmask.zip
Yeah, that's not really the hard part - though your raster wrangling will be much faster than mine. The hard bit is appropriately penalizing the joint log-likelihood, without adding too much extra computation time, so the optimal solution doesn't include locations on land. I know you and Spoon wrote your own MCMC in tripEstimation, is SGAT similar or does it use an ML approach?
Oh I see, sorry - it's similar in SGAT. It was easier to ensure that candidate locations passed the mask upfront which was always a hassle. I tried a likelihood-mask on distance to coast so that points would be nudged towards the ocean but it never worked well. I'll have a closer look
"...but it never worked well." sounds sooo familiar in this case!
@camrinbraun the preferred option as of v 1.0-5 is to use route_path()
after fitting an SSM to track data. This leverages @jmlondon's pathroutr
pkg https://github.com/jmlondon/pathroutr to reroute locations that are erroneously on land. Currently, route_path()
uses the rnaturalearth
medium or high-res polygon land data but for a future update I hope to explore additional options such as bathymetry strata.
Here's an example:
Closing this as route_path()
and potential function options in sim_fit()
are intended to provide a land constraint, which could be adapted by users to impose an arbitrary bathymetric constraint - though this might require users to engage more directly with the pathroutr
package (route_path()
is just a wrapper function for pathroutr
).