Spatial CV

Open · dirtdude opened this issue 3 years ago • 1 comment

Hi. I'm working through this example using my own data: https://gitlab.com/openlandmap/spatial-predictions-using-eml#using-geographical-distances-to-improve-spatial-interpolation

Just curious how the spatial CV is implemented. I'm getting poorer CV performance with landmap than with caretEnsemble: R2 of ~0.32 from landmap versus ~0.5 from caretEnsemble, using a linear combination of base learners. This is probably due to the spatial CV, as my points are clustered. Basically, I am wondering whether the spatial CV can be tuned, and where the geographical distances can be accessed. The GitLab page says "This runs number of steps including derivation of geographical distances"; what is under the hood here?

Thanks!

dirtdude, Mar 11 '21 20:03

The spatial CV is implemented as blocked cross-validation, in two steps:

  1. Estimate the spatial autocorrelation range of the target variable (cell.size), if possible by fitting a variogram to the residuals (see train.spLearner.R).
  2. Use that range as the block size when training the Ensemble Model (see train.spLearner.R), via the resampling = mlr::makeResampleDesc(method = "CV", blocking.cv = TRUE) argument.
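
To make the idea behind the two steps concrete, here is a minimal sketch of blocked CV. It is written in plain Python purely as an illustration and is not landmap's or mlr's actual code. The helper names (block_ids, blocked_folds), the square-grid blocking, and the sample coordinates are all assumptions; cell_size stands in for the cell.size range estimated in step 1.

```python
import math
import random
from collections import defaultdict


def block_ids(coords, cell_size):
    """Map each (x, y) point to the id of the square block that contains it."""
    return [(math.floor(x / cell_size), math.floor(y / cell_size))
            for x, y in coords]


def blocked_folds(coords, cell_size, k=5, seed=0):
    """Return k lists of point indices; every block lands in exactly one fold."""
    by_block = defaultdict(list)
    for i, b in enumerate(block_ids(coords, cell_size)):
        by_block[b].append(i)
    blocks = sorted(by_block)
    random.Random(seed).shuffle(blocks)
    folds = [[] for _ in range(k)]
    # Assign whole blocks to the currently smallest fold to keep sizes balanced.
    for b in blocks:
        min(folds, key=len).extend(by_block[b])
    return folds


# Hypothetical clustered sample: three tight clusters of 20 points each.
rng = random.Random(42)
centers = [(100.0, 100.0), (900.0, 200.0), (600.0, 800.0)]
coords = [(cx + rng.uniform(-20, 20), cy + rng.uniform(-20, 20))
          for cx, cy in centers for _ in range(20)]

# cell_size plays the role of the estimated autocorrelation range (assumed value).
folds = blocked_folds(coords, cell_size=250.0, k=3)
for n, test_idx in enumerate(folds):
    print(f"fold {n}: {len(test_idx)} test points")
```

Because whole blocks are held out together, a test point never has a training neighbour from the same block, which is why clustered data usually score lower here than under random CV.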

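I think the ~0.32 R-square is the one you should report: with clustered points, non-spatial CV tends to overestimate accuracy because test points sit close to training points. Read more about spatial CV.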

thengl, Mar 12 '21 15:03