trtllm-serve error
System Info
RuntimeError: Failed to import transformers.models.bert.modeling_bert because of the following error (look up to see its traceback): /usr/local/lib/python3.12/dist-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Who can help?
No response
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
trtllm-serve error
Traceback (most recent call last):
File "/usr/local/bin/trtllm-serve", line 5, in
Expected behavior
d
actual behavior
?
additional notes
d
Which release are you working with? Can you try with our main branch? If there's still an issue, please share steps to reproduce alongside hardware details.
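For whoever hits this: the requested hardware and package details can be collected with a short script like the sketch below. It only uses standard torch attributes and importlib.metadata, and deliberately avoids importing tensorrt_llm, since that import is exactly what fails here.

# Version-report sketch; importlib.metadata reads package metadata without importing
# the (currently broken) packages themselves.
from importlib.metadata import version
import torch

print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
for pkg in ("tensorrt_llm", "tensorrt", "flash_attn", "transformers"):
    try:
        print(pkg, version(pkg))
    except Exception:
        print(pkg, "not installed")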
I encountered the same issue.
Environment:
- Base image: nvcr.io/nvidia/pytorch:25.04-py3
- I ran the following steps:
# Clear the NGC container's pip constraint file so tensorrt-llm can install its own pins
[ -f /etc/pip/constraint.txt ] && : > /etc/pip/constraint.txt
pip uninstall -y tensorrt
pip install tensorrt-llm
# Run the auto_deploy example
cd TensorRT-LLM/examples/auto_deploy
python build_and_run_ad.py --config '{"model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0"}'
But I got the same error during execution.
I'm also getting a similar error in the NGC container nvcr.io/nvidia/pytorch:25.04-py3 with tensorrt_llm 0.19.0:
In [1]: import tensorrt_llm
ImportError: /usr/local/lib/python3.12/dist-packages/tensorrt_llm/bindings.cpython-312-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs
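Both undefined symbols demangle to the c10::Error constructor taking a std::string, just with different string ABIs: the flash_attn one references the __cxx11 variant, the bindings one the pre-cxx11 "Ss" variant. That points to a C++ ABI mismatch between the container's PyTorch and the prebuilt wheels. A quick check (just a sketch, not an official diagnostic):

# Report which C++ string ABI this torch build uses; whichever extension disagrees
# with it was compiled against a different libtorch.
import torch

print("torch:", torch.__version__)
print("built with CXX11 ABI:", torch.compiled_with_cxx11_abi())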
@lucaslie, do you think you can answer the question about auto deploy?
Same problem here, but 0.20.0rc3 works. I think you should use an earlier PyTorch image like nvcr.io/nvidia/pytorch:25.01-py3 (not tested).
Thanks, I was able to resolve it by switching to nvcr.io/nvidia/pytorch:25.01-py3 with TensorRT-LLM 0.20.0rc3. However, when I run python build_and_run_ad.py --config '{"model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0"}', I now get AssertionError: Model Factory AutoModelForCausalLM not found.
Traceback (most recent call last):
File "/data_8t_1/qby/build_and_run_ad.py", line 135, in <module>
main()
File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/data_8t_1/qby/build_and_run_ad.py", line 97, in main
llm = build_llm_from_config(config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data_8t_1/qby/build_and_run_ad.py", line 60, in build_llm_from_config
factory = ModelFactoryRegistry.get(config.model_factory)(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/auto_deploy/models/factory.py", line 154, in get
assert cls.has(name), f"Model Factory {name} not found."
^^^^^^^^^^^^^
AssertionError: Model Factory AutoModelForCausalLM not found.
After investigating, I found that the ModelFactoryRegistry._registry only contains 'hf', but config.model_factory is set to 'AutoModelForCausalLM'.
@classmethod
def get(cls, name: str) -> Type[ModelFactory]:
    assert cls.has(name), f"Model Factory {name} not found."
    return cls._registry[name]
When I comment out the assertion and force it to return cls._registry['hf'], the code runs correctly. So I’m wondering: is 'AutoModelForCausalLM' supposed to be registered somewhere and wasn’t, or should I manually change the config to use 'hf' instead? Would appreciate any guidance on the correct usage here.
@classmethod
def get(cls, name: str) -> Type[ModelFactory]:
    # assert cls.has(name), f"Model Factory {name} not found."
    return cls._registry['hf']
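A less invasive check than patching factory.py is to list what the registry actually contains and, if only 'hf' is registered, pass that name through the config instead of the default. This is only a sketch: the model_factory override in the JSON is an assumption based on the config.model_factory attribute in the traceback, not something I have verified against the config parser.

# Inspect the registry without editing library code; the import path comes from the
# traceback above and _registry is the dict mentioned in this thread.
from tensorrt_llm._torch.auto_deploy.models.factory import ModelFactoryRegistry

print(sorted(ModelFactoryRegistry._registry.keys()))  # reportedly just ['hf'] here

# If only 'hf' is present, overriding the factory name in the --config JSON may avoid
# the patch (assuming build_and_run_ad.py honors the field, which I have not verified):
#   python build_and_run_ad.py --config '{"model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0", "model_factory": "hf"}'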