text-generation-inference
I encountered the same issue while using `baichuan2-13B-chat`.
I extracted the chat parameters from baichuan2's `generation_config.json`; when I invoke the chat method through the TGI interface, the result is as follows.
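For context, a minimal sketch of how such parameters can be forwarded to TGI's `/generate` endpoint. The sampling values below are illustrative placeholders, not necessarily the exact contents of baichuan2's `generation_config.json`, and `build_tgi_payload` is a hypothetical helper name:

```python
import json

# Illustrative sampling parameters, as they might appear in a model's
# generation_config.json (placeholder values, not baichuan2's verified ones).
gen_config = {
    "temperature": 0.3,
    "top_k": 5,
    "top_p": 0.85,
    "repetition_penalty": 1.05,
}

def build_tgi_payload(prompt: str, config: dict, max_new_tokens: int = 512) -> str:
    """Map generation_config.json fields onto TGI's /generate request body."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "temperature": config.get("temperature"),
            "top_k": config.get("top_k"),
            "top_p": config.get("top_p"),
            "repetition_penalty": config.get("repetition_penalty"),
            "max_new_tokens": max_new_tokens,
        },
    }
    return json.dumps(payload)

body = build_tgi_payload("你好", gen_config)
# POST this body to http://<tgi-host>/generate
# with header Content-Type: application/json
```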
Here are the deployment parameters for TGI:

```
--max-batch-prefill-tokens 4096 --max-input-length 4096 --max-total-tokens 4608
```
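As a config fragment only, one way those flags might be passed to the TGI Docker image (the model path and port here are assumptions, not taken from the report):

```shell
# Hypothetical launch command; adjust volume, model id, and port to your setup.
docker run --gpus all -p 8080:80 -v /data/models:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id /data/baichuan2-13B-chat \
  --max-batch-prefill-tokens 4096 \
  --max-input-length 4096 \
  --max-total-tokens 4608
```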
Originally posted by @zTaoplus in https://github.com/huggingface/text-generation-inference/issues/981#issuecomment-1734882450