Arraymancer
Shuffling, k-fold and stratified k-fold
Shuffle
Deterministic (seeded, reproducible) shuffles are generally needed in both deep learning and classical machine learning.
In many cases the input data is ordered: for example, the IMDB dataset lists all positive reviews, then all negative reviews. Training on it in that order causes large gradient updates when passing from one section to the other and skews the final weights toward the negative class, whereas shuffled data reaches a better balance.
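As an illustration of the idea, here is a minimal sketch of a deterministic shuffle in plain Python. It is not Arraymancer's API: the helper name `shuffled_in_unison`, the toy data, and the seed value are assumptions made for this example. It shuffles an index permutation with a seeded generator and applies it to both samples and labels, so each review keeps its sentiment and the same seed always gives the same order.

```python
import random


def shuffled_in_unison(samples, labels, seed):
    """Return shuffled copies of samples and labels, keeping pairs aligned.

    Hypothetical helper for illustration only.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    # A private generator: the same seed always yields the same permutation,
    # and the global random state is left untouched.
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)
    # Apply one permutation to both sequences so pairs stay together.
    return [samples[i] for i in order], [labels[i] for i in order]


# Ordered toy data: all positive reviews first, then all negative ones.
reviews = [f"review_{i}" for i in range(6)]
sentiments = [1, 1, 1, 0, 0, 0]

shuffled_reviews, shuffled_sentiments = shuffled_in_unison(
    reviews, sentiments, seed=42
)
```

Shuffling an index permutation rather than the data itself leaves the inputs unmodified and makes it easy to reuse the same order for several aligned arrays.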
K-Fold and Stratified K-Fold
Controlling folds is key to reaching the highest accuracy and to making sure we don't contaminate the model with target leaks when building complex ensembles.
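To make the two splitting schemes concrete, here is a minimal sketch in plain Python. It is not Arraymancer's API: the function names `k_fold_indices`, `stratified_k_fold_indices`, and `train_validation_splits` are hypothetical, and dealing indices round-robin is only one of several reasonable ways to build folds. Both functions return lists of sample indices; the stratified variant deals each class separately so that every fold keeps roughly the original class proportions.

```python
import random
from collections import defaultdict


def k_fold_indices(n_samples, k, seed):
    """Split range(n_samples) into k shuffled, disjoint folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    rng = random.Random(seed)  # seeded so the folds are reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Striding deals indices round-robin: fold sizes differ by at most one.
    return [indices[f::k] for f in range(k)]


def stratified_k_fold_indices(labels, k, seed):
    """Like k_fold_indices, but each fold keeps roughly the class ratio."""
    if not 2 <= k <= len(labels):
        raise ValueError("k must be between 2 and len(labels)")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for members in by_class.values():
        rng.shuffle(members)
        # Deal this class round-robin, continuing where the last class
        # stopped so overall fold sizes stay balanced too.
        for pos, idx in enumerate(members):
            folds[(offset + pos) % k].append(idx)
        offset += len(members)
    return folds


def train_validation_splits(folds):
    """Yield (train, validation) index lists, holding out one fold each time."""
    for held_out, validation in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, validation


labels = [1] * 6 + [0] * 6
plain_folds = k_fold_indices(10, k=3, seed=0)
stratified_folds = stratified_k_fold_indices(labels, k=3, seed=0)
```

Because every index lands in exactly one fold, the validation samples of a split never appear in its training set. Reusing the same seeded folds for every model in an ensemble is one way to keep out-of-fold predictions free of target leakage.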