tensorrtllm_backend
[Bugfix] Fix the thread lock when users submit the same id
When deploying with IFB (in-flight batching), suppose a user sends a payload like the following:
{
"text_input": str(question),
"max_tokens": 512,
"bad_words": "",
"stop_words": stop_words,
"pad_id": pad_id,
"end_id": end_id,
"top_p": 1,
"id": "ggbond_test",
"temperature": 0.0000001
}
If every payload's id is the same, the following error occurs:
Exception in thread Thread-1 (awaiter_loop):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
    self._target(*self._args, **self._kwargs)
  File "/code/model_pipeline_name/tensorrt_llm/1/model.py", line 920, in awaiter_loop
    del self.triton_user_id_to_req_ids[
KeyError: 'ggbond_test'
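
The failure mode can be illustrated with a minimal sketch. This is not the code in model.py and not necessarily the patch in this PR: only the dictionary name `triton_user_id_to_req_ids` comes from the traceback, while `register`, `cleanup_strict`, `cleanup_tolerant`, and `reproduce` are hypothetical helpers, and the assumption is that the mapping is keyed by the user-supplied id so that two requests with the same id share one entry.

```python
# Hypothetical stand-in for the backend's bookkeeping. Only the dict name is
# taken from the traceback; everything else is assumed for illustration.
triton_user_id_to_req_ids = {}


def register(user_id, req_id):
    # A second request with the same user id overwrites the first entry,
    # so both requests end up sharing a single key.
    triton_user_id_to_req_ids[user_id] = {req_id}


def cleanup_strict(user_id):
    # Mirrors the `del` seen in the traceback: raises KeyError if the
    # entry was already removed by an earlier request with the same id.
    del triton_user_id_to_req_ids[user_id]


def cleanup_tolerant(user_id):
    # One possible guard: pop with a default is a no-op when the key is gone.
    triton_user_id_to_req_ids.pop(user_id, None)


def reproduce():
    # Two requests arrive with the same id, then each one finishes.
    register("ggbond_test", 1)
    register("ggbond_test", 2)
    cleanup_strict("ggbond_test")      # first request finishes, key removed
    try:
        cleanup_strict("ggbond_test")  # second request finishes, key missing
    except KeyError as exc:
        return repr(exc)
    return None
```

In this sketch the second `cleanup_strict` call raises the same `KeyError` as above. If that exception escapes inside a background thread such as `awaiter_loop`, the thread exits, which would be consistent with later requests appearing to hang. Using a tolerant removal, or giving each payload a unique id on the client side, avoids the error in the sketch.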