[Feature][Config] Add Tulu3/Olmo2 model configs

Open wizeng23 opened this issue 9 months ago • 3 comments

Feature request

These are recently released open-source models from AI2. We should have configs for training, evaluation, and inference with them. See the existing configs under configs/recipes, e.g. those for Llama 3.1. Also see #1361 for a related feature that adds Tulu3 dataset support.

Motivation / references

List of models we'd like to add:

https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B
https://huggingface.co/allenai/Llama-3.1-Tulu-3.1-8B
https://huggingface.co/allenai/Llama-3.1-Tulu-3-70B
https://huggingface.co/allenai/Llama-3.1-Tulu-3-405B

https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct
https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct
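
To make the scope concrete, here is a rough sketch of the config files this request might translate to: one file per model for each of training, evaluation, and inference. The directory layout, the family names (`tulu3`, `olmo2`), the task folder names, and the helper `planned_config_paths` are all hypothetical, chosen only for illustration; the real layout should follow whatever the existing Llama 3.1 recipes under configs/recipes use.

```python
from pathlib import PurePosixPath

# Hugging Face model IDs taken from the list above.
MODEL_IDS = [
    "allenai/Llama-3.1-Tulu-3-8B",
    "allenai/Llama-3.1-Tulu-3.1-8B",
    "allenai/Llama-3.1-Tulu-3-70B",
    "allenai/Llama-3.1-Tulu-3-405B",
    "allenai/OLMo-2-1124-7B-Instruct",
    "allenai/OLMo-2-1124-13B-Instruct",
]

# Task folder names are an assumption, mirroring the three use cases in the issue.
TASKS = ("training", "evaluation", "inference")


def planned_config_paths(model_id: str, root: str = "configs/recipes") -> list[str]:
    """Return hypothetical config paths (one per task) for a model ID."""
    name = model_id.split("/", 1)[1]  # drop the "allenai/" org prefix
    # Family folder names are illustrative, not an existing convention.
    family = "olmo2" if name.startswith("OLMo-2") else "tulu3"
    return [str(PurePosixPath(root) / family / task / f"{name}.yaml") for task in TASKS]


if __name__ == "__main__":
    for model_id in MODEL_IDS:
        for path in planned_config_paths(model_id):
            print(path)
```

With six models and three tasks, this comes to 18 config files under the assumed layout.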

Your contribution

If somebody can volunteer to start this work, I can answer questions and help with testing.

wizeng23, Feb 07 '25 08:02

I opened the pull request that added the dataset support. The reason I didn't include proper configs for training the models is that training the full model would take about 50 hours on 64 GPUs, and I don't think I have the resources to do this myself.

There are details on how to reproduce the Tulu 3 models here, and I could make configs based on them, but I am not going to be able to actually build and evaluate the models properly.

bwalshe, Feb 12 '25 10:02

Yep, I saw that change, thanks for making it! This is a separate feature request: adding configs for these models to Oumi, similar to the ones we have for Llama, in case users want to fine-tune the models further, evaluate them, or run inference on them.

wizeng23, Feb 12 '25 19:02

Sure. I am just saying that the configs from the original project are there.

bwalshe, Feb 12 '25 20:02