
Error converting a T5 model into FT

Liudeep opened this issue 3 years ago

I converted a T5 model into FT using 't5_ckpt_convert.py', and one parameter failed to convert: [ERROR] cannot find key 'lm_head.weight'. This parameter is defined in T5ForConditionalGeneration in Hugging Face and produces the decoding logits at each step. The converted model works in the Triton backend, but its outputs do not match the original T5 model's.
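For anyone hitting the same error, a minimal sketch (not from the original report; the checkpoint path is a placeholder) that lists the head-related parameter names in a T5 checkpoint, to see whether it stores an untied lm_head.weight at all:

```python
import torch

# Placeholder path: point this at the downloaded Hugging Face checkpoint.
state_dict = torch.load("pytorch_model.bin", map_location="cpu")

# Print any output-head weights and the shared embedding, with their shapes.
for key in sorted(state_dict):
    if "lm_head" in key or key == "shared.weight":
        print(key, tuple(state_dict[key].shape))
```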

Liudeep avatar Jun 07 '22 07:06 Liudeep

We don't see the key "lm_head.weight" in the T5 models we test. The models we test are standard T5, such as https://huggingface.co/t5-small. I guess the lm_head in the checkpoints we test is named 'shared' (the head weight is tied to the input embedding). If you build the T5 model with a different method that leads to different keys, you need to modify the key mapping in the converter.
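To illustrate the tying described above, a small check (assuming `transformers` is installed and the Hub model is reachable) showing that in standard T5 the output head and the shared embedding are the same tensor, so the checkpoint needs no separate 'lm_head.weight' entry:

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Standard T5 ties the output head to the input embedding table.
print(model.config.tie_word_embeddings)                                   # True
print(model.lm_head.weight.data_ptr() == model.shared.weight.data_ptr())  # True: same storage
```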

byshiue avatar Jun 07 '22 09:06 byshiue

Hi, Liudeep. Conversion of lm_head.weight has been added to the converter in the latest release: https://github.com/NVIDIA/FasterTransformer/blob/main/examples/pytorch/t5/utils/huggingface_t5_ckpt_convert.py
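The gist of such a fix, sketched in hedged form (this is not the converter's exact code; the checkpoint path is a placeholder): use a dedicated lm_head.weight when the checkpoint has one, and fall back to the tied embedding otherwise.

```python
import torch

state_dict = torch.load("pytorch_model.bin", map_location="cpu")  # placeholder path

if "lm_head.weight" in state_dict:
    # Checkpoint has an untied output head (e.g. tie_word_embeddings=False).
    lm_head = state_dict["lm_head.weight"]
else:
    # Standard T5: the head is tied to the shared embedding.
    lm_head = state_dict["shared.weight"]

print(lm_head.shape)  # (vocab_size, d_model)
```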

byshiue avatar Aug 16 '22 03:08 byshiue

Closing this bug because it is inactive. Feel free to re-open this issue if you still have any problems.

byshiue avatar Sep 08 '22 07:09 byshiue