oumi
[Feature][Config] Add Tulu3/Olmo2 model configs
Feature request
These are recently released open-source models from AI2. We should have configs for training/evaluation/inference with them. See the existing configs under configs/recipes, e.g., for Llama 3.1. Also see #1361 for a related feature that adds Tulu3 dataset support.
Motivation / references
List of models we'd like to add:
https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B
https://huggingface.co/allenai/Llama-3.1-Tulu-3.1-8B
https://huggingface.co/allenai/Llama-3.1-Tulu-3-70B
https://huggingface.co/allenai/Llama-3.1-Tulu-3-405B
https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct
https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct
Your contribution
If somebody can volunteer to start this work, I can answer questions and help with testing.
I opened the pull request that adds the dataset support. I didn't include proper configs for training the models because training the full model would take about 50 hours on 64 GPUs, and I don't have the resources to do that myself.
There are details on how to reproduce the Tulu 3 models here, and I could draft configs based on them, but I won't be able to actually train and evaluate the models properly.
Yep, I saw that change, thanks for making it! This is a separate feature request to add configs for these models in Oumi, similar to the configs we have for Llama, so that users can fine-tune the models further, evaluate them, run inference on them, etc.
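To make the scope concrete, a config for further SFT fine-tuning of Tulu 3 8B might look roughly like the sketch below. This is modeled loosely on the existing Llama 3.1 recipes; every field name, value, and the dataset id here are unverified assumptions and would need to be checked against the actual schema under configs/recipes before use.

```yaml
# Hypothetical sketch of an SFT fine-tuning config for Llama-3.1-Tulu-3-8B.
# Field names mirror the Llama 3.1 recipes but are assumptions, not a tested config.
model:
  model_name: "allenai/Llama-3.1-Tulu-3-8B"
  torch_dtype_str: "bfloat16"

data:
  train:
    datasets:
      - dataset_name: "allenai/tulu-3-sft-mixture"  # assumed dataset id

training:
  trainer_type: "TRL_SFT"
  per_device_train_batch_size: 1
  gradient_accumulation_steps: 8
  learning_rate: 5.0e-6
  num_train_epochs: 2
  output_dir: "output/tulu3-8b-sft"
```

Matching evaluation and inference configs could follow the same pattern as the existing Llama recipes, pointing at the same model names.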
Sure. I'm just noting that the configs from the original project are available there.