Dify: Query or prefix is too long

Open sjn920336697 opened this issue 2 years ago • 6 comments

Dify version

0.3.32

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Dify reports the following error: “the query or prefix prompt is too long. You can reduce the prefix prompt, shrink the max token, or switch to a LLM with a larger token limit size.”

The backend model (ChatGLM2) is served through Xinference and deployed locally, with the maximum length set to 8192 (8K).

I have no problem sending a Chinese text of more than 2000 characters directly to Xinference. However, when I send it through an application built on Dify, Dify reports that the length exceeds the limit, even though max_tokens is set to 450.

The same problem also occurs with the OpenLLM framework.

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

sjn920336697 avatar Dec 06 '23 16:12 sjn920336697

I meet the same problem.

gqchen-dz avatar Dec 07 '23 01:12 gqchen-dz

The max token setting is the length of the reply itself, but `8k = max_token + prefix_prompt`.
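The relation above can be sketched as a small budget check. This is only an illustration of the arithmetic, not Dify's actual implementation; the function names `remaining_prompt_budget` and `fits_context` are hypothetical, and all values are counted in tokens, not characters.

```python
def remaining_prompt_budget(context_size: int, max_tokens: int) -> int:
    """Tokens left for the prefix prompt plus the query once the reply is reserved."""
    return context_size - max_tokens


def fits_context(context_size: int, max_tokens: int, prompt_tokens: int) -> bool:
    """True if the prompt and the reserved reply fit into the context window together."""
    return prompt_tokens <= remaining_prompt_budget(context_size, max_tokens)


if __name__ == "__main__":
    # Hypothetical numbers: an 8192-token window with 450 tokens reserved for the reply.
    print(remaining_prompt_budget(8192, 450))
    print(fits_context(8192, 450, 2100))
```

If the check fails, the options are the ones the error message lists: shorten the prefix prompt, lower max_tokens, or use a model with a larger context window.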

crazywoola avatar Dec 07 '23 01:12 crazywoola

Thanks for your reply, but 8k > max_token (450) + prefix_prompt (about 100) + content (about 2000). Unfortunately, even after I reduced max_tokens to 50, it still doesn't work.

I have no problem sending Xinference directly a Chinese text of prefix_prompt (about 100) + content (about 3000, which is more than 2000 + 450). This confuses me.
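One thing worth checking, though nothing in this thread confirms it is the cause: the figures above are character counts, while the limit is enforced in tokens, and the two can differ a lot for Chinese text. As a rough sketch, assuming a byte-level BPE tokenizer (where every token covers at least one byte) and ignoring any special tokens a chat template adds, the UTF-8 byte length is an upper bound on the token count. The helper name `utf8_byte_upper_bound` is hypothetical.

```python
def utf8_byte_upper_bound(text: str) -> int:
    """Upper bound on the token count under a byte-level BPE tokenizer.

    Assumption: each token covers at least one UTF-8 byte, so the number of
    tokens cannot exceed the number of bytes. Special tokens are not counted.
    """
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    sample = "你好，世界"  # 5 characters, each 3 bytes in UTF-8
    print(len(sample), utf8_byte_upper_bound(sample))
```

The real count depends on which tokenizer the framework uses for its length check, so comparing that count against the configured context size is more reliable than comparing character counts.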

sjn920336697 avatar Dec 07 '23 07:12 sjn920336697

In my case, the backend is Xinference with chatglm3-6b-32k, and Dify's max_tokens is set to 500. There is no problem when sending content of 2164 Chinese characters (prefix prompt + question included) directly to Xinference, but the same content in Dify triggers the error. Is there any way to adjust this?

gqchen-dz avatar Dec 07 '23 07:12 gqchen-dz

> In my case, the backend is Xinference with chatglm3-6b-32k, and Dify's max_tokens is set to 500. There is no problem when sending content of 2164 Chinese characters (prefix prompt + question included) directly to Xinference, but the same content in Dify triggers the error. Is there any way to adjust this?

I have the same problem.

sjn920336697 avatar Dec 07 '23 07:12 sjn920336697

Do we need to upgrade the deps to the latest? @takatost

crazywoola avatar Dec 07 '23 11:12 crazywoola