rsample
Function for returning the original data frame with fold assignments appended?
Feature
Over on spatialsample, there have been a few requests (https://github.com/tidymodels/spatialsample/issues/158, https://github.com/tidymodels/spatialsample/issues/157) for a function that basically works like this:
library(rsample)
library(magrittr)
library(generics)

# Stack the assessment set of every split, tagging each row with the
# index of the split it came from.
augment.rset <- function(rset, ..., fold_column = "fold") {
  purrr::list_rbind(
    purrr::map(
      seq_len(nrow(rset)),
      function(fold) {
        # Rows held out in this split
        fold_members <- get_rsplit(rset, fold) %>%
          assessment()
        fold_members[[fold_column]] <- fold
        fold_members
      }
    )
  )
}

vfold_cv(Orange) %>%
  augment()
#>    Tree  age circumference fold
#> 1     1 1004           115    1
#> 2     3 1231           115    1
#> 3     5 1004           125    1
#> 4     5 1231           142    1
#> 5     2  118            33    2
#> 6     2 1231           172    2
#> 7     4  664           112    2
#> 8     5  484            49    2
#> 9     1  118            30    3
#> 10    2 1582           203    3
#> 11    3  118            30    3
#> 12    4 1231           179    3
#> 13    1  484            58    4
#> 14    1 1582           145    4
#> 15    4 1004           167    4
#> 16    5  118            30    4
#> 17    1  664            87    5
#> 18    2 1004           156    5
#> 19    3  484            51    5
#> 20    5 1372           174    5
#> 21    1 1372           142    6
#> 22    2  664           111    6
#> 23    4 1372           209    6
#> 24    1 1231           120    7
#> 25    3 1582           140    7
#> 26    4  118            32    7
#> 27    2 1372           203    8
#> 28    3 1372           139    8
#> 29    5  664            81    8
#> 30    2  484            69    9
#> 31    3 1004           108    9
#> 32    5 1582           177    9
#> 33    3  664            75   10
#> 34    4  484            62   10
#> 35    4 1582           214   10
Created on 2024-03-06 with reprex v2.0.2
I think this is wanted both as an "escape hatch" from spatialsample, so that these CV objects can be used with models that aren't (yet?) built into the tidymodels framework, and to make it easier to visualize fold assignments. The above is basically how autoplot.spatial_rset gets fold assignments for its own visualizations.
Would it make sense to add a function like this to rsample?
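Thinking about this for a second longer: the implementation above wouldn't work with repeated CV (or nested, I think). With repeats, every row lands in an assessment set once per repeat, so the output would contain each row several times, and the fold column would only hold the split's position in the rset rather than a repeat/fold pair.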
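For the repeated CV case, one possible direction is to carry over the rset's own id columns instead of a single integer. The sketch below is only an illustration, not a proposed API: assessment_with_ids is a hypothetical name, it assumes the id columns all start with "id" (as with the id and id2 columns that vfold_cv() returns when repeats > 1), and it does not attempt to handle nested resampling.

```r
library(rsample)

# Hypothetical helper, not part of rsample: stack the assessment set of every
# split and copy that split's id columns (id, id2, ...) onto its rows.
assessment_with_ids <- function(rset) {
  # Assumption: identifier columns are exactly those whose names start with "id"
  id_cols <- grep("^id", names(rset), value = TRUE)

  pieces <- lapply(seq_len(nrow(rset)), function(i) {
    held_out <- assessment(rset$splits[[i]])
    for (col in id_cols) {
      held_out[[col]] <- rep(as.character(rset[[col]][[i]]), nrow(held_out))
    }
    held_out
  })

  do.call(rbind, pieces)
}

set.seed(1)
folds <- vfold_cv(Orange, v = 5, repeats = 2)
out <- assessment_with_ids(folds)

# Each row of Orange shows up once per repeat, labelled by repeat and fold
table(out$id, out$id2)
```

The result is in long format, one row per observation per repeat, which seems easier to filter or plot than trying to squeeze several repeats into a single fold column.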