MLBase.jl

Add random train-test splitting

Open • abbradar opened this issue 8 years ago • 1 comment

This can be implemented with the sample family of functions from StatsBase. An example implementation with an sklearn-like interface is here. If that's okay, I can open a PR; what holds me back is that I'm a newcomer and may simply have missed an existing, obvious way to do it.

EDIT: Another nice addition would be support for splitting several arrays simultaneously. I'll work on this if it's considered useful.
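To make the proposal concrete, here is a rough sketch of what such a split could look like. It is not the linked implementation and not an existing MLBase.jl API: the name train_test_split, the test_size keyword, and the rng keyword are all hypothetical and only mirror the sklearn-style interface mentioned above. It assumes StatsBase is installed and that every input is a vector of the same length.

```julia
using Random
using StatsBase: sample

# Hypothetical helper, not part of MLBase.jl or StatsBase.
# Splits one or more equal-length vectors into train and test parts,
# using the same randomly drawn indices for every vector.
function train_test_split(arrays::AbstractVector...; test_size::Real = 0.25,
                          rng::AbstractRNG = Random.default_rng())
    isempty(arrays) && throw(ArgumentError("at least one array is required"))
    n = length(first(arrays))
    all(a -> length(a) == n, arrays) ||
        throw(DimensionMismatch("all arrays must have the same length"))
    0 < test_size < 1 || throw(ArgumentError("test_size must be in (0, 1)"))

    n_test = round(Int, test_size * n)
    # Draw the test indices without replacement; the rest form the training set.
    test_idx = sample(rng, 1:n, n_test; replace = false)
    train_idx = setdiff(1:n, test_idx)

    # Return one (train, test) pair per input array.
    return map(a -> (a[train_idx], a[test_idx]), arrays)
end

X = collect(1:10)
y = collect(101:110)
(X_train, X_test), (y_train, y_test) = train_test_split(X, y; test_size = 0.3)
```

Because the indices are drawn once and reused, the rows of X and y stay aligned across the split, which is the point of supporting several arrays in one call.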

abbradar, May 13 '16 19:05

I think the sample function in StatsBase doesn't let the user specify the dimension along which to sample, so in practice it's only useful for 1-dimensional arrays.
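If that is the case, one possible workaround is to sample indices along the chosen dimension and then index the array with them, rather than sampling the array itself. The sketch below assumes StatsBase is installed; sample_along and its dim and rng keywords are hypothetical names used only for illustration.

```julia
using Random
using StatsBase: sample

# Hypothetical helper, not part of StatsBase or MLBase.jl.
# Picks k slices of A along dimension `dim`, without replacement.
function sample_along(A::AbstractArray, k::Integer; dim::Integer = 1,
                      rng::AbstractRNG = Random.default_rng())
    # Sample positions along `dim` instead of sampling elements of A.
    idx = sample(rng, axes(A, dim), k; replace = false)
    # Build an index tuple: `idx` in position `dim`, `:` everywhere else.
    inds = ntuple(d -> d == dim ? idx : Colon(), ndims(A))
    return A[inds...]
end

A = reshape(1:12, 4, 3)
rows = sample_along(A, 2; dim = 1)  # two rows of A, size (2, 3)
cols = sample_along(A, 2; dim = 2)  # two columns of A, size (4, 2)
```

The same index-based approach would let a train-test split work on matrices whose observations are stored along either rows or columns.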

bobbywlindsey, Jul 28 '16 04:07