resamplr

Feature Request: Support resampling of resample objects

Open. wdearden opened this issue 7 years ago • 1 comment

I have a data frame and am fitting models on heavily overlapping subsets of it. Within each subset I then use time series cross-validation. Storing a separate copy of each subset of the data frame would be very inefficient.

As a reproducible example:

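library(tidyverse)  # tibble(), map(), mutate(), %>%
library(resamplr)   # assumed source of resample_df() and crossv_ts()

n <- 10

df <- tibble(
    x = 1:n,
    y = 2 * 1:n
)

# Subset i keeps every row except row i, so the subsets overlap heavily.
samples <- resample_df(df, map(1:n, ~ setdiff(1:n, .)))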

samples holds the heavily overlapping subsets of the data frame. I can then run time series cross-validation on each subset with:

samples_crossv <- samples %>% 
    mutate(sample = map(sample, ~ as.data.frame(.) %>% crossv_ts()))

However, this loses the pointer to the original data frame. You can see this by creating:

samples_dfs <- mutate(samples, sample = map(sample, as.data.frame))

Then compare object_size(samples) and object_size(samples_dfs). My data frame is wide enough, and the subsets overlap enough, that this would be a very useful feature.

wdearden, May 19 '17 15:05

Sorry for the delay. That's completely doable and should be pretty easy. All the resampling functions have non-exported versions that take the length of the vector as input and return an output. I plan on doing one more big rewrite of this package soon (next week) and then submitting it to CRAN.

jrnold, Jun 15 '17 18:06
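
For readers following this thread, here is a minimal base R sketch of how resampling a resample object could work without copying data. It is an illustration only and is not resamplr's implementation: the names make_resample, as_df, resample_again, and ts_splits are hypothetical, and the "resample" here is simply a list holding the shared data frame plus a vector of row indices. The idea is that a splitter which needs only the number of rows can be applied to a subset, and its output composed with the subset's indices so the result still points at the original data frame.

```r
# Hypothetical stand-in for a resample object: shared data plus row indices.
# (An assumption for illustration, not resamplr's internal structure.)
make_resample <- function(data, idx) {
  structure(list(data = data, idx = as.integer(idx)), class = "toy_resample")
}

# Materialize the rows only when they are actually needed.
as_df <- function(r) r$data[r$idx, , drop = FALSE]

# Resample a resample: inner_idx holds positions within r$idx, so composing
# the two index vectors yields row numbers of the original data frame.
resample_again <- function(r, inner_idx) {
  make_resample(r$data, r$idx[inner_idx])
}

# Hypothetical rolling-origin splitter that only needs the number of rows.
# Each split trains on positions 1..k and tests on position k + 1.
ts_splits <- function(n, min_train = 3L) {
  lapply(seq(min_train, n - 1L), function(k) {
    list(train = seq_len(k), test = k + 1L)
  })
}

n <- 10
df <- data.frame(x = 1:n, y = 2 * 1:n)

# Leave-one-out subsets, mirroring the example in the issue.
samples <- lapply(1:n, function(i) make_resample(df, setdiff(1:n, i)))

# Nested time series splits for the first subset (which drops row 1).
first <- samples[[1]]
nested <- lapply(ts_splits(length(first$idx)), function(s) {
  list(
    train = resample_again(first, s$train),
    test  = resample_again(first, s$test)
  )
})

# The first training set refers to rows 2, 3, 4 of the original df.
as_df(nested[[1]]$train)$x
```

Every nested object stores only an integer vector, so the cost of each extra level of resampling is proportional to the number of indices rather than to the width of the data frame.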