Adding a new model
I am trying to add a new model (llama2-13b) by following the steps quoted below, but the link for GPTModel is broken, so I can't work out the exact config for the YAML file. Without that config I can't profile the model, and so I can't add it.
- Add a YAML model config for the new model in data/model_configs.
- Use the model's HuggingFace model id for the file name, e.g. data/model_configs/meta-llama/Llama-2-70b-hf.yml.
- Refer to the HuggingFace config.json for the model, e.g. https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/config.json.
- Ensure that correct parameters are set in the YAML file so that the reference transformer model GPTModel closely resembles the new model.
- We use this reference model to profile only the MLP operations of all the models so the attention operations are no-op'ed here.
Is there a solution for this? The model_configs folder is not even present in the base directory.
@anmolagarwalcp810 could you please share the updated instructions? Let's also make sure they are reflected in the docs.
Any update on this front?
Any update on this front?
Hi @spliii, create a new class for the model you want to add in model_config.py. We used to use YAML files for the config, but we have since switched to dataclasses inside Python files.
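For illustration, here is a minimal sketch of what such a dataclass could look like for llama2-13b. The base class, the field names, and the get_name helper are assumptions made for this example and are not copied from model_config.py, so check that file for the actual names. The numeric values come from the HuggingFace config.json for meta-llama/Llama-2-13b-hf.

```python
from dataclasses import dataclass


@dataclass
class BaseModelConfig:
    """Hypothetical stand-in for the project's base config class."""

    num_layers: int
    num_q_heads: int
    num_kv_heads: int
    embedding_dim: int
    mlp_hidden_dim: int
    max_position_embeddings: int
    vocab_size: int

    @staticmethod
    def get_name() -> str:
        raise NotImplementedError

    def head_dim(self) -> int:
        # Per-head dimension, derived from the embedding size.
        return self.embedding_dim // self.num_q_heads


@dataclass
class Llama2_13BModelConfig(BaseModelConfig):
    # Values taken from the HuggingFace config.json of Llama-2-13b-hf.
    # Field names are illustrative assumptions, not the project's schema.
    num_layers: int = 40                 # num_hidden_layers
    num_q_heads: int = 40                # num_attention_heads
    num_kv_heads: int = 40               # num_key_value_heads
    embedding_dim: int = 5120            # hidden_size
    mlp_hidden_dim: int = 13824          # intermediate_size
    max_position_embeddings: int = 4096
    vocab_size: int = 32000

    @staticmethod
    def get_name() -> str:
        # HuggingFace model id, mirroring the old file-name convention.
        return "meta-llama/Llama-2-13b-hf"


config = Llama2_13BModelConfig()
print(config.get_name(), config.head_dim())
```

The comments map each field to the corresponding key in config.json, which is the same lookup the old YAML instructions asked for.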
@nitinkedia7 What about adding a new model with an MoE or MLA structure, like DeepSeek?