Adding new model

Open samarth1612 opened this issue 1 year ago • 5 comments

In the instructions quoted below, the link for GPTModel is broken, so I can't work out the exact config for the YAML file. As a result, I can't add a new model (llama2-13b), because I can't profile the data for that model.

  1. Add a YAML model config for the new model in data/model_configs.
     • Use the model's HuggingFace model id for the file name, e.g. data/model_configs/meta-llama/Llama-2-70b-hf.yml.
     • Refer to the HuggingFace config.json for the model, e.g. https://huggingface.co/meta-llama/Llama-2-70b-hf/blob/main/config.json.
     • Ensure that the correct parameters are set in the YAML file so that the reference transformer model GPTModel closely resembles the new model.
     • We use this reference model to profile only the MLP operations of all the models, so the attention operations are no-op'ed here.

Is there a solution for this? Even the model_configs folder is missing from the base directory.

samarth1612 avatar Sep 02 '24 09:09 samarth1612

@anmolagarwalcp810, could you please share the updated instructions? Let's also make sure that we reflect those in the docs.

AgrawalAmey avatar Sep 05 '24 19:09 AgrawalAmey

Any update on this front?

samarth1612 avatar Sep 17 '24 10:09 samarth1612

Any update on this front?

spliii avatar Jul 13 '25 14:07 spliii

Hi @spliii, create a new class for the model you want to add in model_config.py. We used to use YAML files for model configs, but we have since moved to dataclasses defined in Python files.
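
For readers following along, here is a minimal sketch of what such a dataclass could look like for llama2-13b. It is self-contained and does not import vidur: `BaseModelConfig`, its field names, `get_name()`, and `from_hf_config()` are hypothetical stand-ins, so check model_config.py in the repository for the real base class and the fields it expects. The numeric values are the ones I believe appear in the HuggingFace config.json for meta-llama/Llama-2-13b-hf; please verify them against that file before profiling.

```python
from dataclasses import dataclass


@dataclass
class BaseModelConfig:
    """Hypothetical stand-in for the base class in model_config.py."""

    num_layers: int
    num_q_heads: int
    num_kv_heads: int
    embedding_dim: int
    mlp_hidden_dim: int
    max_position_embeddings: int
    vocab_size: int
    use_gated_mlp: bool = True  # Llama-style MLPs use a gated (SwiGLU) block

    @staticmethod
    def get_name() -> str:
        raise NotImplementedError


@dataclass
class Llama2_13BModelConfig(BaseModelConfig):
    """Assumed values for meta-llama/Llama-2-13b-hf; verify against config.json."""

    num_layers: int = 40                 # num_hidden_layers
    num_q_heads: int = 40                # num_attention_heads
    num_kv_heads: int = 40               # num_key_value_heads (no GQA at 13B)
    embedding_dim: int = 5120            # hidden_size
    mlp_hidden_dim: int = 13824          # intermediate_size
    max_position_embeddings: int = 4096
    vocab_size: int = 32000

    @staticmethod
    def get_name() -> str:
        # Hypothetical convention: identify the model by its HuggingFace id.
        return "meta-llama/Llama-2-13b-hf"


def from_hf_config(hf: dict) -> dict:
    """Map HuggingFace Llama config.json keys to the hypothetical field names above."""
    return {
        "num_layers": hf["num_hidden_layers"],
        "num_q_heads": hf["num_attention_heads"],
        # Older configs omit num_key_value_heads; fall back to plain multi-head attention.
        "num_kv_heads": hf.get("num_key_value_heads", hf["num_attention_heads"]),
        "embedding_dim": hf["hidden_size"],
        "mlp_hidden_dim": hf["intermediate_size"],
        "max_position_embeddings": hf["max_position_embeddings"],
        "vocab_size": hf["vocab_size"],
    }


if __name__ == "__main__":
    cfg = Llama2_13BModelConfig()
    # Per-head dimension is a quick sanity check on the values.
    print(cfg.get_name(), cfg.embedding_dim // cfg.num_q_heads)
```

The `from_hf_config` helper is only there to make the key mapping explicit; the keys on the right-hand side are the standard ones in a HuggingFace Llama config.json, while the names on the left are illustrative and may differ from what the simulator actually uses.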

nitinkedia7 avatar Jul 13 '25 15:07 nitinkedia7

@nitinkedia7 What about adding a new model with MoE or MLA structure, like Deepseek?

zhuango avatar Aug 05 '25 08:08 zhuango