Olmo / OLMo consistency

Open Muennighoff opened this issue 1 year ago • 7 comments

🐛 Describe the bug

In the HF code we use OLMo, but in the training code it's Olmo. This creates some inconsistencies when importing from the training modeling file. I think we should settle on one for both; what do you think? Probably OLMo, as it's the actual name. Happy to open a PR. (Filing this as a bug because, e.g., _no_split_modules = ["OLMoBlock"] doesn't work, I think.)
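To illustrate why the casing matters, here is a minimal sketch, assuming the no-split list is matched against class names as exact, case-sensitive strings. The classes and the `find_no_split` helper are hypothetical stand-ins for illustration, not the actual transformers or accelerate code.

```python
# Toy model of class-name matching. All names here are hypothetical
# stand-ins; this is not the real device-map logic.

class OlmoBlock:
    """Stand-in for the block class as it is named in the training code."""


def find_no_split(modules, no_split_names):
    """Return the modules whose class name appears in no_split_names.

    The comparison is an exact, case-sensitive string match.
    """
    return [m for m in modules if type(m).__name__ in no_split_names]


blocks = [OlmoBlock(), OlmoBlock()]

# The HF-side spelling does not match the training-side class name.
matched_hf_spelling = find_no_split(blocks, ["OLMoBlock"])

# The training-side spelling matches every block.
matched_training_spelling = find_no_split(blocks, ["OlmoBlock"])

print(len(matched_hf_spelling), len(matched_training_spelling))
```

Under that assumption, a list entry spelled "OLMoBlock" silently matches nothing when the class is actually called OlmoBlock, which would explain the behaviour described above.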

cc @AkshitaB @dirkgr

Versions

latest

Muennighoff avatar Feb 19 '24 14:02 Muennighoff

Please file a PR! In the code, they should all be "OlmoSomething".

dirkgr avatar Feb 20 '24 18:02 dirkgr

Sure, so I'll rename everything in https://github.com/allenai/OLMo/blob/main/hf_olmo/modeling_olmo.py to Olmo?

Muennighoff avatar Feb 20 '24 18:02 Muennighoff

👍🏻

dirkgr avatar Feb 20 '24 19:02 dirkgr

Sounds good - will just wait for @AkshitaB to give her okay, as I will need to change a bunch of things including the config of the model on the hub https://huggingface.co/allenai/OLMo-7B/blob/main/config.json

Muennighoff avatar Feb 20 '24 19:02 Muennighoff

Huggingface recommended OLMoXYZ as the naming convention for the model. Also, I would recommend not making changes to the model on the hub unless absolutely necessary, as it's not just 1 model, it's ALL the 1000+ checkpoints for all 3 models.

AkshitaB avatar Feb 23 '24 00:02 AkshitaB

Ok, then we'll go with Huggingface's suggestion. That means we rename everything to OLMo*, right?
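As a rough sketch of what such a rename could look like when done mechanically, the snippet below rewrites identifiers that start with "Olmo" followed by an uppercase letter. The `rename_olmo_identifiers` helper and the sample source string are hypothetical, and this is not necessarily how the actual change was made.

```python
import re

# Match "Olmo" at a word boundary only when an uppercase letter follows,
# so class-style names like OlmoBlock are renamed while lowercase module
# names such as hf_olmo or modeling_olmo are left alone.
_OLMO_PREFIX = re.compile(r"\bOlmo(?=[A-Z])")


def rename_olmo_identifiers(source: str) -> str:
    """Rewrite Olmo-prefixed CamelCase identifiers to the OLMo spelling."""
    return _OLMO_PREFIX.sub("OLMo", source)


# Hypothetical sample input for illustration.
sample = 'from hf_olmo import OlmoConfig\n_no_split_modules = ["OlmoBlock"]\n'
renamed = rename_olmo_identifiers(sample)
print(renamed)
```

A textual rewrite like this only touches the source files; anything that stores class names outside the repository would still need separate handling.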

dirkgr avatar Feb 23 '24 00:02 dirkgr

Fixed it here: https://github.com/allenai/OLMo/pull/466

Muennighoff avatar Feb 25 '24 09:02 Muennighoff