datatable
Support stratified kfold
It would be very useful to support stratified KFold, which keeps each group's share of rows roughly the same in every fold and is used far more often than a random KFold.
Typically it'd look something like:
import datatable as dt
from datatable.models import kfold_stratified
X = dt.Frame(...)
stratified_splits = kfold_stratified(by=X['group'], nsplits=5, seed=1234)
where X is a frame with a column named group, by which the stratification is done.
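To make the request concrete, here is a rough pure-Python sketch of the behaviour I have in mind. It is not datatable code: the name stratified_kfold_indices, its parameters, and the return shape (a list of (train_rows, test_rows) index lists) are assumptions made only for illustration. It shuffles the rows of each group and deals them out to the folds round-robin, so every fold gets a near-equal share of each group.

```python
import random
from collections import defaultdict


def stratified_kfold_indices(labels, nsplits, seed=None):
    """Hypothetical helper: return nsplits (train_rows, test_rows) pairs.

    Rows are assigned so that each label's count differs by at most one
    between any two test folds.
    """
    if nsplits < 2:
        raise ValueError("nsplits must be at least 2")
    rng = random.Random(seed)

    # Group row indices by their label value (insertion order is kept).
    rows_by_label = defaultdict(list)
    for i, label in enumerate(labels):
        rows_by_label[label].append(i)

    # Shuffle each group, then deal its rows out round-robin. The offset
    # carries over between groups so leftovers do not pile up in fold 0.
    folds = [[] for _ in range(nsplits)]
    offset = 0
    for rows in rows_by_label.values():
        rng.shuffle(rows)
        for j, row in enumerate(rows):
            folds[(offset + j) % nsplits].append(row)
        offset += len(rows)

    # Each fold serves once as the test set; the rest is the training set.
    all_rows = set(range(len(labels)))
    splits = []
    for test in folds:
        train = sorted(all_rows - set(test))
        splits.append((train, sorted(test)))
    return splits


# Example: 10 rows of group "a" and 5 rows of group "b", 5 folds.
labels = ["a"] * 10 + ["b"] * 5
splits = stratified_kfold_indices(labels, nsplits=5, seed=1234)
for train_rows, test_rows in splits:
    print(test_rows, [labels[i] for i in test_rows])
```

With a frame, the labels could presumably be taken from X['group'].to_list()[0] and the resulting index lists used as row selectors, but a native implementation would of course avoid the round trip through Python lists.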