Adam I
This PR adds replay to GPT-NeoX. I originally implemented it for the paper [Simple and Scalable Strategies to Continually Pre-train Large Language Models](https://arxiv.org/abs/2403.08763), which shows simple ways to...