Error when converting a T5 model to FT
I converted a T5 model to FT using 't5_ckpt_convert.py' and one parameter failed to convert: [ERROR] cannot find key 'lm_head.weight'. This parameter is defined in T5ForConditionalGeneration in Hugging Face and produces the decoding logits at each step. The converted model runs in the Triton backend, but its outputs do not match those of the original T5 model.
We don't see the key "lm_head.weight" in the T5 models we test. The models we test are standard T5, such as https://huggingface.co/t5-small. I guess that in the checkpoints we test, the lm_head is tied to the input embedding, so its weight is stored under the name "shared".
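For reference, here is a minimal sketch (assuming the standard Hugging Face transformers API; the model names are only examples) showing how to check whether a given T5 variant ties its lm_head to the shared embedding:

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")

# t5-small ties the lm_head to the input embedding, so the logits
# projection weight is stored once, as the shared embedding.
print(model.config.tie_word_embeddings)             # True for t5-small
print(model.lm_head.weight is model.shared.weight)  # True when tied

# Untied variants such as google/t5-v1_1-small keep a distinct
# "lm_head.weight" entry in their checkpoints.
```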
If you build the T5 model in a different way that produces different keys, you need to modify the key mapping in the converter. See the sketch below for one possible shape of such a fallback.
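As an illustration only (the file name and variable names here are hypothetical, not the converter's actual API), a fallback from the missing key to the tied embedding could look like:

```python
import torch

# Load the raw Hugging Face checkpoint (hypothetical local path).
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

key = "lm_head.weight"
if key not in state_dict:
    # Tied-embedding checkpoints (e.g. t5-small) store the logits
    # projection only once, under the shared embedding key.
    key = "shared.weight"

lm_head_weight = state_dict[key]
```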
Hi, Liudeep. Conversion of lm_head.weight has been added to the converter in the latest release: https://github.com/NVIDIA/FasterTransformer/blob/main/examples/pytorch/t5/utils/huggingface_t5_ckpt_convert.py
Closing this bug because it is inactive. Feel free to reopen this issue if you still have any problems.