
docs(website): explain how users can perform train-test splitting with Ibis

Open jitingxu1 opened this issue 1 year ago • 3 comments

Randomly partition a dataset into subsets (for example, train and test sets) while ensuring the split is reproducible.

Reference:

  • https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
  • https://rsample.tidymodels.org/reference/initial_split.html
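To make the request concrete, here is a minimal sketch of one common way to get a reproducible, approximately sized split: hash a unique row key together with a seed and assign the row by bucket. It is plain Python for illustration only, not Ibis or IbisML API; the names `assign_split`, `train_test_split`, `key_field`, and `buckets` are hypothetical, and it assumes every row carries a unique key.

```python
import hashlib


def assign_split(key, test_size=0.25, seed=42, buckets=10_000):
    """Deterministically map a unique row key to 'train' or 'test'.

    Hypothetical helper: the same (key, seed) pair always lands in the
    same subset, so the split is reproducible across runs.
    """
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    # Keys falling in the first `test_size` share of buckets go to test.
    return "test" if bucket < int(test_size * buckets) else "train"


def train_test_split(rows, key_field, test_size=0.25, seed=42):
    """Partition a list of dict rows into (train, test) by hashed key."""
    train, test = [], []
    for row in rows:
        if assign_split(row[key_field], test_size, seed) == "test":
            test.append(row)
        else:
            train.append(row)
    return train, test


rows = [{"id": i, "x": i * 2} for i in range(1000)]
train, test = train_test_split(rows, "id", test_size=0.25, seed=42)
```

Because assignment depends only on the key and the seed, rerunning with the same seed yields the same subsets, and each row lands in exactly one of them. The test fraction is only approximately `test_size`, since it depends on how the hashes happen to fall.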

jitingxu1 avatar Apr 11 '24 23:04 jitingxu1

https://github.com/ibis-project/ibis-ml/pull/60 implements a basic (approximate) train-test split using Ibis. It could be very nice to wrap this up as part of a utility in IbisML, but it would be the only non-Step utility at this time. Maybe it's sufficient to just show it in the demo notebook for now? I'm not sure.

@lostmygithubaccount do you think users would really like to have this utility exposed directly, or that it would increase the value prop? If so, happy to make it P0.

deepyaman avatar Apr 16 '24 20:04 deepyaman

demonstrating how it's done w/ sufficient explanation seems fine for now

lostmygithubaccount avatar Apr 16 '24 20:04 lostmygithubaccount

> demonstrating how it's done w/ sufficient explanation seems fine for now

Updated the issue title to reflect this.

deepyaman avatar May 07 '24 18:05 deepyaman

Explanation is in the tutorial; @jitingxu1 created https://github.com/ibis-project/ibis-ml/pull/124 with an implementation, so let's close this as completed.

deepyaman avatar Jul 01 '24 18:07 deepyaman