How many tokens was the 7B model trained on?
The paper and the README both say the model was trained on 2.5T tokens, but the corresponding config specifies 2T tokens. README: https://github.com/allenai/OLMo/blob/26392798cbc4d9ac3898bd2949e77042220bf3f8/README.md?plain=1#L49 Config:
https://github.com/allenai/OLMo/blob/26392798cbc4d9ac3898bd2949e77042220bf3f8/configs/official/OLMo-7B.yaml#L74C1-L74C13
We had to make configuration tweaks mid-run in order to train for more than 1 epoch (https://github.com/allenai/OLMo/issues/584). The 2.5T token count is accurate.
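For anyone comparing the README figure against the config, here is a minimal sketch of how a total-token count is usually derived from training hyperparameters. The batch size, sequence length, and step count below are placeholder values for illustration, not the actual OLMo-7B settings:

```python
# Illustrative sketch (not from the OLMo repo): total tokens seen during
# training are typically batch size (in sequences) x sequence length x steps.

def trained_tokens(global_batch_size: int, sequence_length: int, steps: int) -> int:
    """Return the total number of tokens processed over `steps` optimizer steps."""
    return global_batch_size * sequence_length * steps

# A config's max_duration caps a single run, but if the run is extended
# mid-stream (e.g. to go past 1 epoch), the final token count can exceed
# what the original config alone suggests.
if __name__ == "__main__":
    total = trained_tokens(global_batch_size=2048, sequence_length=2048, steps=500_000)
    print(f"{total:,} tokens")  # ~2.1T with these placeholder numbers
```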
Hi, thanks again for the inquiry! We’re currently working on closing out old tickets, so we’re closing this out for now, but if you require a follow-up response, please re-open and we will get back to you!