vidur Adding new model

In the steps quoted below, the link for GPTModel is broken, so I can't find the exact config needed for the YAML file. As a result, I can't add a new model (llama2-13b), because I can't profile data for it.

Add a YAML model config for the new model in data/model_configs.

Use the model's HuggingFace model id for the file name, e.g. data/model_configs/meta-llama/Llama-2-70b-hf.yml.
Refer to the HuggingFace config.json for the model, e.g. https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/config.json.
Ensure that correct parameters are set in the YAML file so that the reference transformer model GPTModel closely resembles the new model.
We use this reference model to profile only the MLP operations of all the models so the attention operations are no-op'ed here.

Is there a solution for this? Even the model_configs folder is missing from the base directory.

Sep 02 '24 09:09 samarth1612

@anmolagarwalcp810 could you please share the updated instructions? Let's also make sure they are reflected in the docs.

Sep 05 '24 19:09 AgrawalAmey

Any update on this front?

Sep 17 '24 10:09 samarth1612

Any update on this front?

Jul 13 '25 14:07 spliii

Hi @spliii, create a new class for the model you want to add in model_config.py. We used to use YAML files for model configs, but we have since switched to dataclasses defined in Python files.
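
As a rough illustration of the dataclass approach, here is a minimal sketch for Llama-2-13b. The base class `ExampleBaseModelConfig`, the field names, and the `get_name()` helper are assumptions made for this example and may not match what model_config.py actually defines, so check the existing model classes there and mirror their structure. The numeric values are taken from the model's HuggingFace config.json.

```python
from dataclasses import dataclass


@dataclass
class ExampleBaseModelConfig:
    """Hypothetical stand-in for the real base class in model_config.py."""

    num_layers: int
    num_q_heads: int
    num_kv_heads: int
    embedding_dim: int
    mlp_hidden_dim: int
    max_position_embeddings: int
    vocab_size: int
    use_gated_mlp: bool = True  # Llama-style MLPs use a gated (SwiGLU) layout


@dataclass
class Llama2_13BModelConfig(ExampleBaseModelConfig):
    """Values from meta-llama/Llama-2-13b-hf config.json (field names assumed)."""

    num_layers: int = 40               # num_hidden_layers
    num_q_heads: int = 40              # num_attention_heads
    num_kv_heads: int = 40             # num_key_value_heads (no GQA for 13B)
    embedding_dim: int = 5120          # hidden_size
    mlp_hidden_dim: int = 13824        # intermediate_size
    max_position_embeddings: int = 4096
    vocab_size: int = 32000

    @staticmethod
    def get_name() -> str:
        # Assumed convention: identify the model by its HuggingFace model id.
        return "meta-llama/Llama-2-13b-hf"


if __name__ == "__main__":
    cfg = Llama2_13BModelConfig()
    head_dim = cfg.embedding_dim // cfg.num_q_heads
    print(cfg.get_name(), cfg.num_layers, head_dim)
```

The comments map each assumed field back to its config.json key, which should make it easier to translate the values to whatever names the real base class uses.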

Jul 13 '25 15:07 nitinkedia7

@nitinkedia7 What about adding a new model that uses an MoE or MLA architecture, like DeepSeek?

Aug 05 '25 08:08 zhuango