
convert_hf_checkpoint only relies on model_name to resolve TransformerArgs

Open Jack-Khuu opened this issue 5 months ago • 0 comments

🐛 Describe the bug

convert_hf_checkpoint transforms a HF checkpoint into a torchchat format.

As part of this process, a ModelArgs is created for the newly downloaded model. Currently, ModelArgs is constructed from model_name alone, without first checking transformer_params_key:

  • config_args = ModelArgs.from_name(model_name).transformer_args['text']

While this is correct most of the time, model_configs defines a transformer_params_key so that a model can specify an alternative model_params, and the current lookup ignores it.

Task: Update convert_hf_checkpoint to check for model_params specified via transformer_params_key before falling back to a lookup by model_name

  • i.e. Attempt to construct ModelArgs with from_table before from_name
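
The lookup order proposed above could be sketched roughly as follows. This is a self-contained illustration, not torchchat code: the `ModelArgs` class here is a stub that mirrors only the `from_table` and `from_name` constructors named in this issue, and `_PARAMS_TABLE`, `resolve_text_args`, the example keys, and the exception type are all hypothetical. Whether a failed `from_table` lookup should fall back silently or raise is a design choice left open here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical stand-in for the known parameter tables.
_PARAMS_TABLE: Dict[str, Dict[str, dict]] = {
    "base-7b": {"text": {"dim": 4096}},
    "custom-params": {"text": {"dim": 2048}},
}


@dataclass
class ModelArgs:
    """Stub mirroring only the two constructors named in the issue."""

    transformer_args: Dict[str, dict] = field(default_factory=dict)

    @classmethod
    def from_table(cls, name: str) -> "ModelArgs":
        # Exact lookup: raises if the key is not a known table entry.
        if name not in _PARAMS_TABLE:
            raise RuntimeError(f"unknown table entry: {name}")
        return cls(transformer_args=_PARAMS_TABLE[name])

    @classmethod
    def from_name(cls, name: str) -> "ModelArgs":
        # Looser lookup: pick the longest table key contained in the name.
        matches = [k for k in _PARAMS_TABLE if k in name.lower()]
        if not matches:
            raise RuntimeError(f"no params match model name: {name}")
        return cls(transformer_args=_PARAMS_TABLE[max(matches, key=len)])


def resolve_text_args(
    model_name: str, transformer_params_key: Optional[str] = None
) -> dict:
    """Prefer the explicit params key; fall back to the model name."""
    if transformer_params_key is not None:
        try:
            args = ModelArgs.from_table(transformer_params_key)
            return args.transformer_args["text"]
        except RuntimeError:
            pass  # key did not resolve; fall through to the name lookup
    return ModelArgs.from_name(model_name).transformer_args["text"]
```

With this ordering, a model whose config sets a params key resolves to that entry even when its name would match a different one, while models without a key behave as they do today.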

Versions

N/A

Jack-Khuu • Sep 23 '24 15:09