Dify: Query or prefix is too long
Dify version
0.3.32
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
The Dify framework reports the error: “the query or prefix prompt is too long. You can reduce the prefix prompt, shrink the max token, or switch to a LLM with a larger token limit size.”
The backend model (ChatGLM2) is served by Xinference and deployed locally; the maximum context length has been set to 8192 (8K).
I have no problem interacting with Xinference directly using a Chinese text of more than 2000 characters. However, when I interact with an application built on Dify, where max_tokens is set to 450, it reports that the length exceeds the limit.
The same problem also occurs with the OpenLLM framework.
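For context, this error comes from a pre-flight budget check on the Dify side rather than from the model itself. Below is a minimal sketch of that kind of check; it is not Dify's actual source, and `check_prompt_fits` and its parameters are illustrative names only:

```python
# Minimal sketch of a context-budget check that produces this error.
# Illustration of the technique, not Dify's actual code.
def check_prompt_fits(context_window: int, prompt_tokens: int, max_tokens: int) -> None:
    # The completion reserve (max_tokens) is subtracted up front,
    # so the prompt must fit in context_window - max_tokens.
    rest_tokens = context_window - max_tokens - prompt_tokens
    if rest_tokens < 0:
        raise ValueError(
            "the query or prefix prompt is too long. You can reduce the "
            "prefix prompt, shrink the max token, or switch to a LLM with "
            "a larger token limit size."
        )

check_prompt_fits(context_window=8192, prompt_tokens=2100, max_tokens=450)  # passes
```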
✔️ Expected Behavior
No response
❌ Actual Behavior
No response
I am facing the same problem.
Thanks for your reply, but 8k > max_tokens (450) + prefix_prompt (about 100) + content (about 2000).
Unfortunately, even after reducing max_tokens to 50, it still doesn't work.
I have no problem interacting with Xinference directly using a Chinese text of prefix_prompt (about 100) + content (about 3000, which is more than 2000 + 450). This confuses me.
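One possible explanation (an assumption, not confirmed from Dify's source) is that the length is being counted in tokens rather than characters. BPE tokenizers tuned for English often need two or more tokens per Chinese character, so a 2000-character prompt can consume far more than 2000 tokens of the budget. A quick way to see this with the tiktoken library:

```python
# Compare character count vs. token count for Chinese text under a BPE
# tokenizer. Whether Dify uses this exact encoding to estimate token
# usage for custom models is an assumption.
import tiktoken

enc = tiktoken.get_encoding("gpt2")
text = "你好世界" * 500  # stand-in for a ~2000-character Chinese prompt
print(len(text), "characters")
print(len(enc.encode(text)), "tokens")  # typically well above the character count
```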
In my case, the backend is Xinference serving chatglm3-6b-32k, and Dify's max_tokens is set to 500. There is no problem when interacting with Xinference directly with content of 2164 Chinese characters (including prefix_prompt + question), but in Dify the same content (including prefix_prompt + question) triggers the error. Is there any way to adjust this?
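A plausible factor here (again an assumption, since the configured limit isn't shown) is that Dify enforces the context size it has on record for the custom model, which may be smaller than what chatglm3-6b-32k actually supports:

```python
# Hypothetical illustration: the pre-flight check uses the context
# length declared for the model, not the one the backend really has.
declared_context = 4096    # assumed limit Dify might hold for a custom model
actual_context = 32768     # what chatglm3-6b-32k supports
prompt_tokens = 2164 * 2   # rough BPE estimate: ~2 tokens per Chinese character
max_tokens = 500

print("fits the real model:", prompt_tokens + max_tokens <= actual_context)          # True
print("fits the declared limit:", prompt_tokens + max_tokens <= declared_context)    # False
```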
Same problem here.
Do we need to upgrade the deps to the latest? @takatost