llm-foundry
Converting a composer seq2seq t5 model throws an exception
Environment
- llm-foundry: latest
To reproduce
Steps to reproduce the behavior:
- train an hf_t5 model
- download the composer checkpoint
- try to convert it back to huggingface via scripts/inference/convert_composer_to_hf.py
- The script crashes when trying to load the saved model as AutoModelForCausalLM (a minimal reproduction is sketched below)
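For reference, the same failure can be reproduced with transformers alone; the stock t5-small checkpoint is used here purely as a stand-in for the converted output:

```python
from transformers import AutoModelForCausalLM

# T5 is an encoder-decoder model, so T5Config is not registered in the
# causal-LM auto-mapping; this raises a ValueError along the lines of
# "Unrecognized configuration class ... for this kind of AutoModel".
model = AutoModelForCausalLM.from_pretrained("t5-small")
```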
Expected behavior
The model is saved as a HuggingFace snapshot without any issue
Additional context
Locally, I fixed this by simply loading with AutoModel instead of AutoModelForCausalLM.
I guess this is fine.
Ah yes, that script only supports causal LMs right now. A note on your solution: I'm not certain, but AutoModel here may give you a T5Model rather than the T5ForConditionalGeneration you probably want. Worth double checking that.
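A quick way to double-check which class the workaround actually hands back (the checkpoint path is a placeholder):

```python
from transformers import AutoModel

# AutoModel maps a T5Config to the bare T5Model backbone.
model = AutoModel.from_pretrained("path/to/converted_hf_checkpoint")  # placeholder path
print(type(model).__name__)  # "T5Model" rather than "T5ForConditionalGeneration"
```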
That was an interesting hint.
Just double checked and the model was indeed marked as a T5Model and not as a T5ForConditionalGeneration.
So I changed that in the conversion script so that it yields the right config. However, loading the final model via AutoModel still results in a T5Model, even though the config now explicitly states the correct model type.
On the other hand, if I load via AutoModelForSeq2SeqLM, it does load the lm_head. So I guess that is an HF-specific thing and not related to the conversion script per se.
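A rough sketch of that check against the converted output (the directory name is a placeholder):

```python
from transformers import AutoConfig, AutoModel, AutoModelForSeq2SeqLM

path = "path/to/converted_hf_checkpoint"  # placeholder

config = AutoConfig.from_pretrained(path)
print(config.architectures)  # e.g. ["T5ForConditionalGeneration"] after the config fix

# AutoModel picks the class from the config's type (T5Config), not from the
# architectures field, so it still returns the backbone.
backbone = AutoModel.from_pretrained(path)
print(type(backbone).__name__)  # T5Model

# AutoModelForSeq2SeqLM maps T5Config to T5ForConditionalGeneration,
# which includes the lm_head on top of the encoder-decoder backbone.
seq2seq = AutoModelForSeq2SeqLM.from_pretrained(path)
print(hasattr(seq2seq, "lm_head"))  # True
```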
Yeah, AutoModel generally gives you the backbone model, while the AutoModelForXYZ will give you the model with adaptation/head for XYZ.
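A minimal illustration of that split, using randomly initialized models so nothing needs to be downloaded:

```python
from transformers import T5Config, T5Model, T5ForConditionalGeneration

cfg = T5Config()  # default config; random weights are enough for the comparison

backbone = T5Model(cfg)                      # what AutoModel resolves a T5 config to
with_head = T5ForConditionalGeneration(cfg)  # what AutoModelForSeq2SeqLM resolves it to

print(hasattr(backbone, "lm_head"))   # False: encoder-decoder backbone only
print(hasattr(with_head, "lm_head"))  # True: adds the LM head for generation
```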