TensorRT-LLM
Support Gemma 1.1 model
System Info
Model: https://huggingface.co/google/gemma-1.1-2b-it
Who can help?
@byshiue
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
- Use the `GemmaForCausalLM.from_hugging_face().save_checkpoint()` API with the https://huggingface.co/google/gemma-1.1-2b-it model; this fails for the 1.1 model but succeeds for the 1.0 model (https://huggingface.co/google/gemma-2b-it). See the sketch after this list.
- Use the trt-llm build tool to build an engine from the checkpoint; this fails for the 1.0 model.
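For reference, a minimal sketch of the two steps I am running is below. The paths, the `dtype` keyword, and the `trtllm-build` flags in the comment are assumptions and may need adjusting for your TensorRT-LLM version.

```python
# Sketch of the failing conversion step; paths and keyword arguments are
# placeholders and may differ between TensorRT-LLM versions.
from tensorrt_llm.models import GemmaForCausalLM

hf_model_dir = "google/gemma-1.1-2b-it"   # fails for 1.1, works for 1.0
ckpt_dir = "./gemma-1.1-2b-it-ckpt"

# Step 1: convert the Hugging Face weights into a TRT-LLM checkpoint.
model = GemmaForCausalLM.from_hugging_face(hf_model_dir, dtype="bfloat16")
model.save_checkpoint(ckpt_dir)

# Step 2: build the engine from the checkpoint (this is the step that fails
# for the 1.0 model), e.g.:
#   trtllm-build --checkpoint_dir ./gemma-2b-it-ckpt --output_dir ./gemma-2b-it-engine
```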
Expected behavior
A successfully built, working TRT-LLM engine for both model versions.
Actual behavior
Either the checkpoint build (for the 1.1 version) or the engine build (for the 1.0 version) fails.
Additional notes
I believe the issue for 1.1 comes from the gelu_pytorch_tanh activation function; I'm not sure what breaks the build for 1.0.
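A quick way to compare the activation settings of the two revisions is sketched below; it assumes `transformers` is installed with access to the gated Gemma repos, and the exact config field names are an assumption (hence the `getattr` fallback).

```python
# Compare activation-related config fields of the 1.0 and 1.1 checkpoints;
# field names may vary across transformers versions, so fall back to None.
from transformers import AutoConfig

for repo in ("google/gemma-2b-it", "google/gemma-1.1-2b-it"):
    cfg = AutoConfig.from_pretrained(repo)
    print(repo,
          getattr(cfg, "hidden_activation", None),
          getattr(cfg, "hidden_act", None))
```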