Olmo / OLMo consistency
🐛 Describe the bug
In the HF code we use OLMo, but in training it's Olmo. This creates some inconsistencies when importing from the training modeling file. I think we should settle on one for both, wdyt? Probably OLMo as it's the actual name; happy to open a PR.
(Filing as a bug because e.g. _no_split_modules = ["OLMoBlock"] doesn't work, I think.)
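To illustrate why the casing could matter here, below is a minimal sketch. It assumes that entries in _no_split_modules are compared against module class names by exact string match, which is my understanding of how the device-map logic works. The classes are hypothetical stand-ins, not the real OLMo modules.

```python
# Hypothetical stand-ins for the training-side classes (named "Olmo*").
class OlmoBlock:
    pass


class OlmoModel:
    def __init__(self):
        self.blocks = [OlmoBlock(), OlmoBlock()]


def matched_modules(modules, no_split_names):
    """Return the modules whose class name appears in no_split_names.

    Assumption: matching is an exact, case-sensitive string comparison
    on the class name.
    """
    return [m for m in modules if type(m).__name__ in no_split_names]


model = OlmoModel()

# HF-style casing does not match the training-style class name.
hf_style = matched_modules(model.blocks, ["OLMoBlock"])
# Training-style casing matches both blocks.
training_style = matched_modules(model.blocks, ["OlmoBlock"])

print(len(hf_style), len(training_style))  # 0 2
```

If that assumption holds, the mismatch fails silently: nothing raises, the entry just never matches any module.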
cc @AkshitaB @dirkgr
Versions
latest
Please file a PR! In the code, they should all be "OlmoSomething".
Sure, so I'll rename everything in here https://github.com/allenai/OLMo/blob/main/hf_olmo/modeling_olmo.py to Olmo?
👍🏻
Sounds good - will just wait for @AkshitaB to give her okay, as I will need to change a bunch of things including the config of the model on the hub https://huggingface.co/allenai/OLMo-7B/blob/main/config.json
Hugging Face recommended OLMoXYZ as the naming convention for the model. Also, I would recommend not making changes to the model on the hub unless absolutely necessary, as it's not just 1 model, it's ALL the 1000+ checkpoints for all 3 models.
Ok, then we'll go with Huggingface's suggestion. That means we rename everything to OLMo*, right?
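For reference, a rename like this could be sketched with a regex. This is only an illustration, not necessarily how the actual change was made; the helper name to_hf_casing and the sample string are made up for the example.

```python
import re

# Match "Olmo" only at the start of an identifier and only when followed by
# an uppercase letter, so "OlmoBlock" is rewritten but "hf_olmo" is not.
_OLMO_PREFIX = re.compile(r"\bOlmo(?=[A-Z])")


def to_hf_casing(source: str) -> str:
    """Rewrite Olmo* identifiers to OLMo* in the given source text."""
    return _OLMO_PREFIX.sub("OLMo", source)


sample = 'from hf_olmo import OlmoConfig\n_no_split_modules = ["OlmoBlock"]\n'
print(to_hf_casing(sample))
```

Because the pattern requires a word boundary before "Olmo", identifiers with a leading underscore or another prefix are left alone and would need a manual check.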
Fixed it here: https://github.com/allenai/OLMo/pull/466