text-generation-inference
I encountered the same issue while using `baichuan2-13B-chat`.
I extracted the chat parameters from baichuan2's `generation_config.json`; when I invoke the chat method through the TGI interface, the result is as follows.
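For context, a minimal sketch of how such parameters can be forwarded to TGI's `/generate` endpoint. The sampling values below are illustrative placeholders, not necessarily the exact contents of baichuan2's `generation_config.json`, and `build_tgi_payload` is a hypothetical helper name:

```python
import json

# Illustrative sampling parameters, as they might appear in a model's
# generation_config.json (placeholder values, not baichuan2's verified ones).
gen_config = {
    "temperature": 0.3,
    "top_k": 5,
    "top_p": 0.85,
    "repetition_penalty": 1.05,
}

def build_tgi_payload(prompt: str, config: dict, max_new_tokens: int = 512) -> str:
    """Map generation_config.json fields onto TGI's /generate request body."""
    payload = {
        "inputs": prompt,
        "parameters": {
            "temperature": config.get("temperature"),
            "top_k": config.get("top_k"),
            "top_p": config.get("top_p"),
            "repetition_penalty": config.get("repetition_penalty"),
            "max_new_tokens": max_new_tokens,
        },
    }
    return json.dumps(payload)

body = build_tgi_payload("你好", gen_config)
# POST this body to http://<tgi-host>/generate
# with header Content-Type: application/json
```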
Here are the deployment parameters for TGI:

```
--max-batch-prefill-tokens 4096 --max-input-length 4096 --max-total-tokens 4608
```
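As a config fragment only, one way those flags might be passed to the TGI Docker image (the model path and port here are assumptions, not taken from the report):

```shell
# Hypothetical launch command; adjust volume, model id, and port to your setup.
docker run --gpus all -p 8080:80 -v /data/models:/data \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id /data/baichuan2-13B-chat \
  --max-batch-prefill-tokens 4096 \
  --max-input-length 4096 \
  --max-total-tokens 4608
```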
Originally posted by @zTaoplus in https://github.com/huggingface/text-generation-inference/issues/981#issuecomment-1734882450