MLDataPattern.jl
Add function separateobs
Based on a suggestion from @oxinabox, I agree it would be a good idea to introduce a function called separateobs (or something of that sort). Its interface should be similar to that of stratifiedobs, but in contrast to stratifiedobs it would try to split the data so that no label appears in more than one of the resulting subsets.
For example, in handwriting recognition one wouldn't necessarily want samples from the same author in both the training and the test set.
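To make the intended behaviour concrete, here is a minimal sketch of such a split over observation indices. It is only an illustration under stated assumptions: `separate_indices` is a hypothetical helper and not part of MLDataPattern.jl, and the `p` and `rng` keywords are chosen for the example rather than taken from a decided interface. Note that `p` here is the fraction of distinct labels, not of observations, that goes to the first subset.

```julia
using Random

# Hypothetical helper (not part of MLDataPattern.jl): split observation
# indices so that every distinct label lands in exactly one subset.
function separate_indices(labels::AbstractVector; p::Real = 0.7,
                          rng::AbstractRNG = Random.default_rng())
    0 < p < 1 || throw(ArgumentError("p must be in (0, 1)"))
    # Shuffle the distinct labels, then cut the shuffled list at fraction p.
    groups = shuffle(rng, unique(labels))
    n_first = clamp(round(Int, p * length(groups)), 0, length(groups))
    first_groups = Set(groups[1:n_first])
    # Assign each observation according to the subset its label belongs to.
    first_idx  = findall(l -> l in first_groups, labels)
    second_idx = findall(l -> !(l in first_groups), labels)
    return first_idx, second_idx
end

# Usage: the author of each handwriting sample acts as the label.
authors = ["ann", "bob", "ann", "cy", "bob", "cy", "ann"]
train_idx, test_idx = separate_indices(authors; p = 0.67, rng = MersenneTwister(1))

# No author contributes samples to both subsets.
@assert isempty(intersect(authors[train_idx], authors[test_idx]))
```

Because whole label groups are assigned at once, the resulting subset sizes only approximate `p` when the groups differ in size, which is presumably why the function can only "try" to honour the requested proportions.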