
Fix - Use Fisher-Yates for uniform shuffling

Open aafulei opened this issue 5 months ago • 0 comments

Hi @glouw, first of all, thank you very much for this elegant codebase 👍 It truly feels like a piece of art to me!

However, there is a small issue with the shuffle algorithm: the current implementation does not produce a uniform random permutation. See here and Wikipedia: Fisher–Yates shuffle for reference.

I made a one-line change to switch to the standard Fisher-Yates algorithm, which should make the shuffling uniform. I hope this helps make the code even better. Thanks again for your great work!

Summary

This PR updates the shuffle function to implement the Fisher-Yates algorithm, ensuring a uniform random shuffle of the data.

Details

  • Changed the random index calculation to select from the current index a to the end of the array.
  • This fixes the bias in the previous shuffle implementation.
  • The change is minimal, modifying only one line for correctness.
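
To illustrate the idea, here is a minimal sketch of the index calculation on a plain `int` array. It is not the actual diff: `shuffle_ints`, `xs`, and `n` are hypothetical names chosen for this example, and the real shuffle function works on the training data rows instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the codebase): shuffles xs in place
 * using the Fisher-Yates scheme. */
void shuffle_ints(int* xs, const int n)
{
    assert(xs != NULL || n <= 0);
    for(int a = 0; a < n; a++)
    {
        /* Draw b from [a, n) rather than from [0, n). */
        const int b = a + rand() % (n - a);
        /* Swap elements a and b. */
        const int tmp = xs[a];
        xs[a] = xs[b];
        xs[b] = tmp;
    }
}
```

Note that `rand() % k` carries a slight modulo bias of its own whenever `k` does not divide `RAND_MAX + 1`. For small arrays this is usually negligible, and it is a separate matter from the bias addressed here.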

Impact

  • Ensures all permutations are equally likely.
  • Improves reliability of training data shuffling in the neural network.
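
A small counting argument shows why the range matters. Assuming the previous version drew the swap index from the whole array, shuffling 3 elements has 3 × 3 × 3 = 27 equally likely draw sequences but only 6 permutations, and 27 is not divisible by 6, so the permutations cannot all be equally likely. With Fisher-Yates there are 3 × 2 × 1 = 6 draw sequences, one per permutation. The sketch below enumerates both cases exhaustively; all names in it (`perm_index`, `swap_ints`, `count_outcomes`) are made up for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Maps a permutation of {0, 1, 2} to a unique index in [0, 6). */
int perm_index(const int p[3])
{
    return p[0] * 2 + (p[1] > p[2] ? 1 : 0);
}

void swap_ints(int* p, const int i, const int j)
{
    const int tmp = p[i];
    p[i] = p[j];
    p[j] = tmp;
}

/* Tries every possible sequence of random draws exactly once and counts
 * how often each permutation of {0, 1, 2} comes out.
 * fisher_yates != 0: step a draws b from [a, 3).
 * fisher_yates == 0: step a draws b from [0, 3). */
void count_outcomes(const int fisher_yates, int counts[6])
{
    memset(counts, 0, 6 * sizeof(int));
    for(int r0 = 0; r0 < 3; r0++)
    for(int r1 = 0; r1 < (fisher_yates ? 2 : 3); r1++)
    for(int r2 = 0; r2 < (fisher_yates ? 1 : 3); r2++)
    {
        int p[3] = {0, 1, 2};
        const int r[3] = {r0, r1, r2};
        for(int a = 0; a < 3; a++)
        {
            const int b = fisher_yates ? a + r[a] : r[a];
            swap_ints(p, a, b);
        }
        const int idx = perm_index(p);
        assert(idx >= 0 && idx < 6);
        counts[idx]++;
    }
}
```

Calling `count_outcomes(1, counts)` fills every slot with 1, while `count_outcomes(0, counts)` spreads 27 outcomes unevenly over the 6 slots.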

Please review and merge if everything looks good. Thanks!

aafulei, May 23 '25 15:05