trtllm-serve error
System Info
RuntimeError: Failed to import transformers.models.bert.modeling_bert because of the following error (look up to see its traceback): /usr/local/lib/python3.12/dist-packages/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Who can help?
No response
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
trtllm-serve error
Traceback (most recent call last):
File "/usr/local/bin/trtllm-serve", line 5, in
Expected behavior
d
actual behavior
?
additional notes
d
Which release are you working with? Can you try with our main branch? If there's still an issue, please share steps to reproduce alongside hardware details.
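For whoever hits this: the requested hardware and package details can be collected with a short script like the sketch below. It only uses standard torch attributes and importlib.metadata, and deliberately avoids importing tensorrt_llm, since that import is exactly what fails here.

# Version-report sketch; importlib.metadata reads package metadata without importing
# the (currently broken) packages themselves.
from importlib.metadata import version
import torch

print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
for pkg in ("tensorrt_llm", "tensorrt", "flash_attn", "transformers"):
    try:
        print(pkg, version(pkg))
    except Exception:
        print(pkg, "not installed")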
I encountered the same issue.
Environment:
- Base image: nvcr.io/nvidia/pytorch:25.04-py3
- I ran the following steps:
# Clear the NGC container's pip constraint file so tensorrt-llm can install its own pins
[ -f /etc/pip/constraint.txt ] && : > /etc/pip/constraint.txt
pip uninstall -y tensorrt
pip install tensorrt-llm
# Run the auto_deploy example
cd TensorRT-LLM/examples/auto_deploy
python build_and_run_ad.py --config '{"model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0"}'
But I got the same error during execution.
I'm also getting a similar error in the NGC container nvcr.io/nvidia/pytorch:25.04-py3 with tensorrt_llm 0.19.0:
In [1]: import tensorrt_llm
ImportError: /usr/local/lib/python3.12/dist-packages/tensorrt_llm/bindings.cpython-312-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs
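Both undefined symbols demangle to the c10::Error constructor taking a std::string, just with different string ABIs: the flash_attn one references the __cxx11 variant, the bindings one the pre-cxx11 "Ss" variant. That points to a C++ ABI mismatch between the container's PyTorch and the prebuilt wheels. A quick check (just a sketch, not an official diagnostic):

# Report which C++ string ABI this torch build uses; whichever extension disagrees
# with it was compiled against a different libtorch.
import torch

print("torch:", torch.__version__)
print("built with CXX11 ABI:", torch.compiled_with_cxx11_abi())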
@lucaslie, do you think you can answer the question about auto deploy?
Same problem here, but 0.20.0rc3 works. I think you should use an earlier PyTorch image like nvcr.io/nvidia/pytorch:25.01-py3 (not tested).
Thanks, I was able to resolve it by switching to nvcr.io/nvidia/pytorch:25.01-py3 with TensorRT-LLM 0.20.0rc3. However, when I run python build_and_run_ad.py --config '{"model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0"}', I now get AssertionError: Model Factory AutoModelForCausalLM not found.
Traceback (most recent call last):
File "/data_8t_1/qby/build_and_run_ad.py", line 135, in <module>
main()
File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/data_8t_1/qby/build_and_run_ad.py", line 97, in main
llm = build_llm_from_config(config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data_8t_1/qby/build_and_run_ad.py", line 60, in build_llm_from_config
factory = ModelFactoryRegistry.get(config.model_factory)(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/tensorrt_llm/_torch/auto_deploy/models/factory.py", line 154, in get
assert cls.has(name), f"Model Factory {name} not found."
^^^^^^^^^^^^^
AssertionError: Model Factory AutoModelForCausalLM not found.
After investigating, I found that the ModelFactoryRegistry._registry only contains 'hf', but config.model_factory is set to 'AutoModelForCausalLM'.
@classmethod
def get(cls, name: str) -> Type[ModelFactory]:
    assert cls.has(name), f"Model Factory {name} not found."
    return cls._registry[name]
When I comment out the assertion and force it to return cls._registry['hf'], the code runs correctly. So I’m wondering: is 'AutoModelForCausalLM' supposed to be registered somewhere and wasn’t, or should I manually change the config to use 'hf' instead? Would appreciate any guidance on the correct usage here.
@classmethod
def get(cls, name: str) -> Type[ModelFactory]:
    # assert cls.has(name), f"Model Factory {name} not found."
    return cls._registry['hf']
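A less invasive check than patching factory.py is to list what the registry actually contains and, if only 'hf' is registered, pass that name through the config instead of the default. This is only a sketch: the model_factory override in the JSON is an assumption based on the config.model_factory attribute in the traceback, not something I have verified against the config parser.

# Inspect the registry without editing library code; the import path comes from the
# traceback above and _registry is the dict mentioned in this thread.
from tensorrt_llm._torch.auto_deploy.models.factory import ModelFactoryRegistry

print(sorted(ModelFactoryRegistry._registry.keys()))  # reportedly just ['hf'] here

# If only 'hf' is present, overriding the factory name in the --config JSON may avoid
# the patch (assuming build_and_run_ad.py honors the field, which I have not verified):
#   python build_and_run_ad.py --config '{"model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0", "model_factory": "hf"}'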