lit-llama
Fixing the pretrain script for Loss Averaging and no_backward_sync()
I think #357 should be applied to the pretrain script as well.
Thank you so much, Lightning team, for this amazing repository.
The RedPajama pretrain script already has gradient accumulation. The Shakespeare script is missing it, and it could be added there too, yes. Contributions welcome!
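For reference, the change being discussed boils down to two things inside the accumulation loop: divide each micro-batch loss by the number of accumulation steps (so the accumulated gradient equals the gradient of the mean full-batch loss), and only synchronize gradients on the micro-batch that triggers the optimizer step. Below is a framework-free sketch of that control flow; `no_backward_sync` here is just a stand-in for Fabric's context manager of the same name (which suppresses the DDP all-reduce on intermediate micro-batches), and the tiny one-parameter model with a hand-written gradient is purely illustrative.

```python
from contextlib import contextmanager, nullcontext

syncs = []  # records on which micro-batch a gradient sync would happen

@contextmanager
def no_backward_sync():
    # Stand-in: the real Fabric context manager skips the DDP
    # gradient all-reduce for backward passes run inside it.
    yield

def grad(w, x, y):
    # d/dw of the squared error (w*x - y)^2 for one sample
    return 2.0 * (w * x - y) * x

def train(batch, accumulation_steps, lr=0.1):
    w, g = 0.3, 0.0
    for i, (x, y) in enumerate(batch):
        is_accumulating = (i + 1) % accumulation_steps != 0
        ctx = no_backward_sync() if is_accumulating else nullcontext()
        with ctx:
            # Scale the micro-batch loss by 1/accumulation_steps,
            # so summing micro-batch gradients yields the mean gradient.
            g += grad(w, x, y) / accumulation_steps
        if not is_accumulating:
            syncs.append(i)   # sync + optimizer step once per cycle
            w -= lr * g
            g = 0.0
    return w

batch = [(1.0, 2.0), (2.0, 1.0), (3.0, 0.5), (0.5, 1.5)]
w = train(batch, accumulation_steps=4)
```

With `accumulation_steps=4` over four micro-batches, the sketch performs exactly one sync and one optimizer step, and the update matches a single full-batch step on the mean loss, which is the behavior the loss-averaging fix restores.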
I'm not yet familiar with GitHub and code editing, but I'm eager to learn and help out! I'll try to learn and edit the Shakespeare code.