open_lm
open_lm copied to clipboard
Fix too many tokens requested edge case.
Sometimes, the model needs to do a few more training steps in a new epoch, and it would load an entire checkpoints worth of data for that. This PR limits the number of tokens requested by how many steps are actually left over.