
TensorRT-LLM [Branch v0.12.0-jetson] Quick confirmation: Gemma2 not supported yet?

Open sdecoder opened this issue 9 months ago • 1 comment

Greetings everyone.

  1. I am trying to use TensorRT-LLM to deploy the Gemma2 LLM on the Jetson AGX Orin platform.
  2. I am following these instructions: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.12.0-jetson/examples/gemma
  3. I downloaded the official checkpoint from the Hugging Face source: huggingface-cli download --resume-download google/gemma-2-27b-it
  4. After running the following command:

python3 ./convert_checkpoint.py \
    --ckpt-type jax \
    --model-dir /home/nvidia/.cache/huggingface/hub/models--google--gemma-2-27b-it/snapshots/aaf20e6b9f4c0fcf043f6fb2a2068419086d77b0 \
    --dtype bfloat16 \
    --world-size 1 \
    --output-model-dir /home/nvidia/projects/TensorRT-LLM/examples/gemma/gemma-2-27b-it-bf16-cvt-ckpt

I got the following error:

[TensorRT-LLM] TensorRT-LLM version: 0.12.0
Loading source parameters from /home/nvidia/.cache/huggingface/hub/models--google--gemma-2-27b-it/snapshots/aaf20e6b9f4c0fcf043f6fb2a2068419086d77b0
Traceback (most recent call last):
  File "/home/nvidia/projects/TensorRT-LLM/examples/gemma/./convert_checkpoint.py", line 268, in <module>
    main()
  File "/home/nvidia/projects/TensorRT-LLM/examples/gemma/./convert_checkpoint.py", line 201, in main
    ckpt_params = ckpt_parser.load_parameters(args.model_dir)
  File "/home/nvidia/anaconda3/envs/tensorrt-llm/lib/python3.10/site-packages/tensorrt_llm/models/gemma/convert.py", line 65, in load_parameters
    gemma_params.load_params(checkpoint_path)))
  File "/home/nvidia/anaconda3/envs/tensorrt-llm/lib/python3.10/site-packages/tensorrt_llm/models/gemma/utils/params.py", line 33, in load_params
    params = checkpointer.restore(path)
  File "/home/nvidia/anaconda3/envs/tensorrt-llm/lib/python3.10/site-packages/orbax/checkpoint/_src/checkpointers/checkpointer.py", line 289, in restore
    restored = self._restore(directory, args=ckpt_args)
  File "/home/nvidia/anaconda3/envs/tensorrt-llm/lib/python3.10/site-packages/orbax/checkpoint/_src/checkpointers/checkpointer.py", line 308, in _restore
    return self._handler.restore(directory, args=args)
  File "/home/nvidia/anaconda3/envs/tensorrt-llm/lib/python3.10/site-packages/orbax/checkpoint/_src/handlers/pytree_checkpoint_handler.py", line 803, in restore
    structure, use_zarr3_metadata = self._get_internal_metadata(directory)
  File "/home/nvidia/anaconda3/envs/tensorrt-llm/lib/python3.10/site-packages/orbax/checkpoint/_src/handlers/pytree_checkpoint_handler.py", line 959, in _get_internal_metadata
    raise FileNotFoundError(
FileNotFoundError: No structure could be identified for the checkpoint at /home/nvidia/.cache/huggingface/hub/models--google--gemma-2-27b-it/snapshots/aaf20e6b9f4c0fcf043f6fb2a2068419086d77b0.
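For context, the traceback shows Orbax's PyTreeCheckpointHandler failing to identify any checkpoint structure, which suggests the directory is not a JAX/Orbax checkpoint at all. A quick way to check what the download actually contains (a minimal sketch; the expected file list is an assumption based on how Hugging Face snapshots are usually laid out):

# List the snapshot directory. A Hugging Face download of gemma-2-27b-it
# would normally contain config.json, tokenizer files, and *.safetensors
# weight shards, not the Orbax metadata files that a --ckpt-type jax
# conversion expects to restore.
ls /home/nvidia/.cache/huggingface/hub/models--google--gemma-2-27b-it/snapshots/aaf20e6b9f4c0fcf043f6fb2a2068419086d77b0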

So I just want to quickly confirm: is Gemma2 still not supported by TensorRT-LLM [Branch v0.12.0-jetson]? Thank you very much for any hints or information.

sdecoder · Mar 21 '25 16:03

My model conversion failed too (for a different reason). But what you downloaded is an HF checkpoint, not a JAX one, isn't it?
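If so, that mismatch would explain the Orbax error above: --ckpt-type jax tells the converter to restore an Orbax checkpoint, while an HF snapshot holds safetensors weights. Assuming this branch's convert_checkpoint.py accepts an hf checkpoint type (an assumption, not verified against v0.12.0-jetson; check the script's --help output), the invocation would presumably look like:

# Sketch only: --ckpt-type hf is assumed to be supported on this branch;
# confirm with `python3 ./convert_checkpoint.py --help` before relying on it.
python3 ./convert_checkpoint.py \
    --ckpt-type hf \
    --model-dir /home/nvidia/.cache/huggingface/hub/models--google--gemma-2-27b-it/snapshots/aaf20e6b9f4c0fcf043f6fb2a2068419086d77b0 \
    --dtype bfloat16 \
    --world-size 1 \
    --output-model-dir /home/nvidia/projects/TensorRT-LLM/examples/gemma/gemma-2-27b-it-bf16-cvt-ckpt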

anonymousmaharaj · Mar 23 '25 22:03