TensorRT-LLM
Support Gemma 1.1 model
System Info
Model: https://huggingface.co/google/gemma-1.1-2b-it
Who can help?
@byshiue
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
- Use the `GemmaForCausalLM.from_hugging_face().save_checkpoint()` API with the https://huggingface.co/google/gemma-1.1-2b-it model; this fails for the 1.1 model but succeeds for the 1.0 model (https://huggingface.co/google/gemma-2b-it). See the sketch after this list.
- Use the trt-llm build tool to build an engine from the checkpoint; this fails for the 1.0 model.
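For reference, a minimal sketch of the two steps I am running is below. The paths, the `dtype` keyword, and the `trtllm-build` flags in the comment are assumptions and may need adjusting for your TensorRT-LLM version.

```python
# Sketch of the failing conversion step; paths and keyword arguments are
# placeholders and may differ between TensorRT-LLM versions.
from tensorrt_llm.models import GemmaForCausalLM

hf_model_dir = "google/gemma-1.1-2b-it"   # fails for 1.1, works for 1.0
ckpt_dir = "./gemma-1.1-2b-it-ckpt"

# Step 1: convert the Hugging Face weights into a TRT-LLM checkpoint.
model = GemmaForCausalLM.from_hugging_face(hf_model_dir, dtype="bfloat16")
model.save_checkpoint(ckpt_dir)

# Step 2: build the engine from the checkpoint (this is the step that fails
# for the 1.0 model), e.g.:
#   trtllm-build --checkpoint_dir ./gemma-2b-it-ckpt --output_dir ./gemma-2b-it-engine
```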
Expected behavior
A successfully built, working TRT-LLM engine for both model versions.
Actual behavior
Either the checkpoint build (for the 1.1 version) or the engine build (for the 1.0 version) fails.
Additional notes
I believe the issue for 1.1 comes from the gelu_pytorch_tanh activation function; I'm not sure what breaks the build for 1.0.
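A quick way to compare the activation settings of the two revisions is sketched below; it assumes `transformers` is installed with access to the gated Gemma repos, and the exact config field names are an assumption (hence the `getattr` fallback).

```python
# Compare activation-related config fields of the 1.0 and 1.1 checkpoints;
# field names may vary across transformers versions, so fall back to None.
from transformers import AutoConfig

for repo in ("google/gemma-2b-it", "google/gemma-1.1-2b-it"):
    cfg = AutoConfig.from_pretrained(repo)
    print(repo,
          getattr(cfg, "hidden_activation", None),
          getattr(cfg, "hidden_act", None))
```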