mlr3
Allow defining more strata in the `partition` function
Splitting data into training and test/holdout sets can be done with the `partition` function, but it only allows stratification on the target variable.
However, I can do the split with `rsmp`, for example:
task_gc = tsk("german_credit")
task_gc$col_roles$stratum = c("credit_risk", "housing", "telephone")
ho = rsmp("holdout", ratio = 0.8)
split = ho$instantiate(task_gc)
split$instance
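As a rough sketch of how the row ids could be read back from the instantiated holdout above: `train_set()`, `test_set()` and `task$strata` are standard mlr3 accessors, while the names `train_ids` and `test_ids` are only illustrative.

```r
library(mlr3)

task_gc = tsk("german_credit")
# Stratify on the target plus two features
task_gc$col_roles$stratum = c("credit_risk", "housing", "telephone")

ho = rsmp("holdout", ratio = 0.8)
ho$instantiate(task_gc)

# Holdout has a single iteration, so the index is always 1
train_ids = ho$train_set(1)
test_ids = ho$test_set(1)

# One row per stratum combination, with its size and row ids
task_gc$strata
```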
Just wondering: could this functionality be brought to `partition` so that more strata can be defined? Then `rsmp` could be kept for its original purpose, resampling during model development on the training data. I think they are essentially doing the same thing.
@mllg Should I add this?
This should already work:
task_gc = tsk("german_credit")
task_gc$col_roles$stratum = c("credit_risk", "housing", "telephone")
split = partition(task_gc, ratio = 0.8)
We should clearly document this better, though.
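Until this is documented, one way to check it could be to compare the joint distribution of the stratum columns in the training split against the full task. This is only a sketch: it assumes `partition()` returns a list with `train` and `test` row ids, and the helper names (`strata_cols`, `prop_full`, `prop_train`, `max_diff`) are made up for illustration.

```r
library(mlr3)

task_gc = tsk("german_credit")
strata_cols = c("credit_risk", "housing", "telephone")
task_gc$col_roles$stratum = strata_cols

set.seed(1)
split = partition(task_gc, ratio = 0.8)

# Joint proportions of the three stratum columns, full data vs. training split
prop_full = prop.table(table(task_gc$data(cols = strata_cols)))
prop_train = prop.table(table(task_gc$data(rows = split$train, cols = strata_cols)))

# Largest absolute difference across all stratum combinations;
# a value close to zero suggests the strata were respected
max_diff = max(abs(prop_full - prop_train))
max_diff
```

Without the stratum role set, the same comparison would typically show larger deviations in the rarer combinations.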
Thanks. This does work.
Hello. What is the difference between partition() and rsmp()?