parfit
Using Parfit with Custom CV split
Currently, cross-validation in parfit can be performed by specifying n_folds. Would it be possible to add functionality for specifying the CV splits manually by index?
Or even better, passing in general Sklearn splitter objects?
Thanks!
Motivation
One possible use case is CV on a time-series dataset, where the usual CV split is unsuitable because of the causality inherent in the data: the model should not be trained on observations that come after the ones it is validated on. The general consensus, then, seems to be to split like this:
assume we have data in 5 blocks: [1,2,3,4,5]
split 1: train: [1], val: [2]
split 2: train: [1,2], val: [3]
split 3: train: [1,2,3], val: [4]
split 4: train: [1,2,3,4], val: [5]
This is implemented in Sklearn as TimeSeriesSplit: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html
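
To make the request concrete, here is a minimal sketch of the split pattern above written as explicit index pairs, which is the kind of input a "manual split" option could accept. The helper name expanding_window_splits is made up for this illustration; it is not part of parfit or scikit-learn, and it uses 0-based indices into the list of blocks.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(n_blocks: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs of 0-based block indices.

    Hypothetical helper for illustration only: each split trains on all
    blocks before the validation block and validates on exactly one block.
    """
    for val_block in range(1, n_blocks):
        yield list(range(val_block)), [val_block]


blocks = [1, 2, 3, 4, 5]

# Map the index pairs back to block labels so they match the listing above.
splits = [
    ([blocks[i] for i in train_idx], [blocks[i] for i in val_idx])
    for train_idx, val_idx in expanding_window_splits(len(blocks))
]

for number, (train, val) in enumerate(splits, start=1):
    print(f"split {number}: train: {train}, val: {val}")
```

As far as I understand, the split() method of Sklearn splitters yields (train indices, validation indices) pairs of the same shape, so supporting an iterable of index pairs would probably cover both the manual case and the splitter case.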