trt-llm-rag-windows
Error Code 1: Serialization (Serialization assertion stdVersionRead == kSERIALIZATION_VERSION failed.Version tag does not match.
When I run the command to start the application, I get the version mismatch error:
The command:
python app.py --trt_engine_path model/ --trt_engine_name llama_float16_tp1_rank0.engine --tokenizer_dir_path Llama-2-13b-chat-hf --data_dir dataset/
The error:
Error Code 1: Serialization (Serialization assertion stdVersionRead == kSERIALIZATION_VERSION failed.Version tag does not match. Note: Current Version: 228, Serialized Engine Version: 226)
Traceback (most recent call last):
File "C:\Users\unubi\trt-llm-rag-windows\app.py", line 63, in <module>
llm = TrtLlmAPI(
File "C:\Users\unubi\trt-llm-rag-windows\trt_llama_api.py", line 166, in __init__
decoder = tensorrt_llm.runtime.GenerationSession(self._model_config,
File "C:\Users\unubi\anaconda3\envs\myenv\lib\site-packages\tensorrt_llm\runtime\generation.py", line 457, in __init__
self.runtime = _Runtime(engine_buffer, mapping)
File "C:\Users\unubi\anaconda3\envs\myenv\lib\site-packages\tensorrt_llm\runtime\generation.py", line 150, in __init__
self.__prepare(mapping, engine_buffer)
File "C:\Users\unubi\anaconda3\envs\myenv\lib\site-packages\tensorrt_llm\runtime\generation.py", line 168, in __prepare
assert self.engine is not None
AssertionError
Exception ignored in: <function _Runtime.__del__ at 0x000001F6C370CA60>
Traceback (most recent call last):
File "C:\Users\unubi\anaconda3\envs\myenv\lib\site-packages\tensorrt_llm\runtime\generation.py", line 266, in __del__
cudart.cudaFree(self.address) # FIXME: cudaFree is None??
AttributeError: '_Runtime' object has no attribute 'address'
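What the failing assertion guards can be sketched in plain Python (this is illustrative only, not TensorRT's actual code): the runtime embeds a serialization version constant, and it refuses to deserialize an engine whose stored version tag differs. Here, the installed runtime expects version 228 but the engine was serialized with version 226, so the engine must be rebuilt with the matching TRT-LLM/TensorRT version.

```python
# Illustrative sketch of the check behind
# "assertion stdVersionRead == kSERIALIZATION_VERSION failed" --
# names and layout are made up for clarity, not TensorRT internals.
KSERIALIZATION_VERSION = 228  # version compiled into the installed runtime


def check_version_tag(std_version_read: int) -> None:
    """Refuse to load an engine serialized under a different version."""
    if std_version_read != KSERIALIZATION_VERSION:
        raise RuntimeError(
            "Version tag does not match. "
            f"Current Version: {KSERIALIZATION_VERSION}, "
            f"Serialized Engine Version: {std_version_read}"
        )
```

With the versions from the log above, `check_version_tag(226)` raises, which is why `self.engine` ends up `None` and the later `AssertionError` fires.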
I got the same issue with the pre-built 4090 engine.
Hello, which TRT-LLM version are you using to build the engine? For compatibility, the engine should be built with TRT-LLM version 0.5.
@MustaphaU and @teis-e : Can you please share the command used to install the wheel?
@shishirgoyal85 @anujj
Installing tensorrt_llm version 0.5, i.e.
pip install tensorrt_llm==0.5 --extra-index-url https://pypi.nvidia.com/ --extra-index-url https://download.pytorch.org/whl/cu121
results in the error:
ERROR: No matching distribution found for torch==2.1.0.dev20230828+cu121
Here is the full log:
Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com, https://download.pytorch.org/whl/cu121
Collecting tensorrt_llm==0.5
Using cached https://pypi.nvidia.com/tensorrt-llm/tensorrt_llm-0.5.0-0-cp310-cp310-win_amd64.whl (431.5 MB)
Collecting build (from tensorrt_llm==0.5)
Using cached build-1.0.3-py3-none-any.whl.metadata (4.2 kB)
INFO: pip is looking at multiple versions of tensorrt-llm to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement torch==2.1.0.dev20230828+cu121 (from tensorrt-llm) (from versions: 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1, 2.1.0, 2.1.0+cu121, 2.1.1, 2.1.1+cu121, 2.1.2, 2.1.2+cu121, 2.2.0, 2.2.0+cu121)
ERROR: No matching distribution found for torch==2.1.0.dev20230828+cu121
@MustaphaU: The command below worked for me for TRT-LLM 0.5:
pip install tensorrt-llm==0.5.0.post1 --extra-index-url https://pypi.nvidia.com --extra-index-url https://download.pytorch.org/whl/nightly/cu121 --extra-index-url https://download.pytorch.org/whl/cu121
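If you are scripting the setup, a quick stdlib-only sanity check can confirm that the installed wheel really is a 0.5.x build before you rebuild the engine (the helper name here is made up for illustration):

```python
# Illustrative version check using only the standard library.
from importlib.metadata import version, PackageNotFoundError


def is_trt_llm_05(installed: str) -> bool:
    """Return True for any 0.5.x release string, e.g. '0.5.0.post1'."""
    return installed.startswith("0.5")


try:
    print(is_trt_llm_05(version("tensorrt_llm")))
except PackageNotFoundError:
    print("tensorrt_llm is not installed")
```

The fix in the command above is the added nightly PyTorch index (whl/nightly/cu121), which is where the torch==2.1.0.dev20230828+cu121 dev wheel that the 0.5 release pinned was published.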
We just released an updated version, 0.3. Please use that branch and follow the README at https://github.com/NVIDIA/ChatRTX/blob/release/0.3/README.md to set up the application.