fix: Fix converting EXAONE when using model_weights_loader
Converting EXAONE without setting TRTLLM_DISABLE_UNIFIED_CONVERTER=1 causes the error below.
I tested with EXAONE-3.5-32B-Instruct as well as the 2.4B and 7.8B variants.
The cause is that the converter checks model_name = hf_model_or_dir, which does not necessarily contain 'exaone' when the checkpoint is stored under an arbitrary local path.
If the model is EXAONE, config.architecture is always ExaoneForCausalLM, so the check should be based on the architecture instead.
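For illustration, here is a minimal sketch of the two detection strategies; the helper names are hypothetical, and only the conditions themselves mirror what this MR changes:

```python
# Hypothetical helpers illustrating the two detection strategies.
# `hf_model_or_dir` is the value passed via --model_dir; `config` stands for the
# loaded model config whose `architecture` field is populated from config.json.

def is_exaone_by_path(hf_model_or_dir: str) -> bool:
    # Current behavior: fails when the checkpoint lives in a directory whose
    # name does not contain "exaone", e.g. /models/my-local-checkpoint.
    return "exaone" in str(hf_model_or_dir).lower()


def is_exaone_by_architecture(config) -> bool:
    # Proposed behavior: EXAONE checkpoints always report ExaoneForCausalLM,
    # so this works no matter what the directory is called.
    return "exaone" in config.architecture.lower()
```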
Error Log
2025-03-10 15:34:22,249 - INFO - flashinfer.jit: Prebuilt kernels not found, using JIT backend
[TensorRT-LLM] TensorRT-LLM version: 0.18.0.dev2025022500
0.18.0.dev2025022500
[03/10/2025-15:34:22] [TRT-LLM] [W] Implicitly setting LLaMAConfig.has_partial_lora_mask = False
[03/10/2025-15:34:22] [TRT-LLM] [W] Implicitly setting LLaMAConfig.tie_word_embeddings = False
5it [00:01, 3.82it/s]
Traceback (most recent call last):
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 587, in <module>
    main()
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 579, in main
    convert_and_save_hf(args)
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 520, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 527, in execute
    f(args, rank)
  File "/app/tensorrt_llm/examples/llama/convert_checkpoint.py", line 502, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/llama/model.py", line 502, in from_hugging_face
    loader.generate_tllm_weights(model, arg_dict)
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 396, in generate_tllm_weights
    self.load(tllm_key,
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 307, in load
    v = sub_module.postprocess(tllm_key, v, **postprocess_kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/layers/linear.py", line 428, in postprocess
    weights = torch.cat(weights)
              ^^^^^^^^^^^^^^^^^^
TypeError: expected Tensor as element 0 in argument 0, but got NoneType
@byshiue, can you help review this MR?
Thanks, June
Hi @byshiue, can you check this MR?
If "exaone" is in model_name.lower(), then --model_dir should point to a path that includes "exaone". For example: export HF_MODEL_DIR=hf_models/exaone
On the other hand, if "exaone" is in config.architecture.lower(), then --model_dir doesn't need to be considered, since the model architecture is already specified in config.json.
The same condition is being used here too.
https://github.com/NVIDIA/TensorRT-LLM/blob/eb2d51a42990b8d0b30bc6c29fad4fd491da749f/tensorrt_llm/models/llama/convert.py#L419-L420
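As a side note on why the architecture-based check is path-independent: the architecture string is read from config.json inside the checkpoint directory, so the directory name itself does not matter. A quick way to confirm this with the standard Hugging Face transformers API (the path below is just a placeholder):

```python
from transformers import AutoConfig

# The folder can have any name; config.json inside it declares the architecture.
cfg = AutoConfig.from_pretrained("hf_models/any-folder-name", trust_remote_code=True)

# For EXAONE checkpoints this prints ["ExaoneForCausalLM"], so a check on the
# architecture string succeeds regardless of the path passed to --model_dir.
print(cfg.architectures)
```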
Hi @lkm2835, are you still facing the same issue? When I tested on ToT main, the issue seems to be gone.