"Model Parameter max_tokens should be less than or equal to 1500.0." when enabling "Segmenting in Question & Answer format"
Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.6.3
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
- Go to the settings of a document in a knowledge base with "Segmenting in Question & Answer format" disabled
- Enable "Segmenting in Question & Answer format"
- An error message saying "Model Parameter max_tokens should be less than or equal to 1500.0." pops up on the top right corner.
- The paragraph preview page is blank.
I am using Qwen's text-embedding-v2 model. I cannot find any place in the document or knowledge base settings where I can set the 'max_tokens' parameter mentioned in the error message.
✔️ Expected Behavior
No error message, and generated paragraphs are shown after building embedding.
❌ Actual Behavior
- An error message saying "Model Parameter max_tokens should be less than or equal to 1500.0." pops up on the top right corner.
- The paragraph preview page is blank.
The error you're encountering with the "Segmenting in Question & Answer format" feature in Dify version 0.6.3 is due to the `max_tokens` parameter exceeding its limit of 1500. This parameter specifies the maximum number of tokens the model can process or generate at a time. Since Dify doesn't provide a GUI option to adjust this parameter for Qwen's text-embedding-v2 model, you'll need to look into the application's configuration or codebase for a solution.
One approach is to modify the parameter directly in the configuration file or the code where the model is invoked, typically the model's YAML configuration file or the indexing runner script. For Dify, relevant files might include `api/core/model_runtime/model_providers/tongyi/llm/qwen-turbo.yaml` or `api/core/indexing_runner.py`, where you can look for ways to adjust the `max_tokens` value.
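For illustration, here is a minimal sketch of what such a change could look like in application code, assuming the Q&A-generation call receives its completion parameters as a plain dict (the names `MODEL_MAX_TOKENS` and `clamp_completion_params` are hypothetical, not Dify's actual API):

```python
# Hypothetical sketch: cap the requested max_tokens before the Q&A-generation
# call, so the provider's validation rule (max 1500 per the error) never trips.
MODEL_MAX_TOKENS = 1500  # the limit reported in the error message

def clamp_completion_params(params: dict) -> dict:
    """Return a copy of the completion params with max_tokens capped."""
    capped = dict(params)
    if capped.get("max_tokens", MODEL_MAX_TOKENS) > MODEL_MAX_TOKENS:
        capped["max_tokens"] = MODEL_MAX_TOKENS
    return capped

# Example: a request asking for 2000 tokens gets clamped down to 1500.
print(clamp_completion_params({"max_tokens": 2000, "temperature": 0.7}))
# -> {'max_tokens': 1500, 'temperature': 0.7}
```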
If direct modification isn't feasible, or if you're unsure how to proceed, consider a segmentation strategy that breaks the input content into smaller chunks that each fall under the `max_tokens` limit. Process each chunk individually and then aggregate the results.
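A rough sketch of that chunking strategy, using tiktoken's `cl100k_base` encoding as a stand-in tokenizer (an assumption; Qwen's actual tokenizer differs, so token counts are approximate):

```python
import tiktoken  # pip install tiktoken

def split_into_chunks(text: str, limit: int = 1500) -> list[str]:
    """Split text into pieces of at most `limit` tokens each."""
    enc = tiktoken.get_encoding("cl100k_base")  # proxy tokenizer, not Qwen's own
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]

# Each chunk can then go through Q&A generation separately, and the
# resulting segments can be aggregated afterwards.
sample = "Dify segments documents before generating Q&A pairs. " * 500
chunks = split_into_chunks(sample)
print(f"{len(chunks)} chunks, each at most 1500 tokens")
```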
For detailed guidance or if there's a mechanism to adjust model parameters through another method in Dify, consulting the official documentation or reaching out to the support team would be advisable.
Sources
- api/core/model_runtime/model_providers/tongyi/llm/qwen-turbo.yaml
- api/core/indexing_runner.py
- api/core/model_runtime/model_providers/tongyi/llm/qwen-plus.yaml
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
Please reduce the max token setting when you split the document; your LLM model only supports 1500 tokens, because we are using the LLM model to generate the Q&A. @LazarusX
@JohnJyong Thanks for the response. I am using Tongyi's text-embedding-v2. I cannot find anywhere to set the max token, either in the model provider settings, knowledge base settings, or document settings.
Could you point me to the right location?
I ran into the same trouble and eventually gave up on the Q&A format.
Please upgrade to the latest version to see if this issue persists. :)
I have encountered the same problem. I'm using the latest version, 0.6.11, Docker, self-hosted, with the qwen text-embedding-v2 model.
+1
I have encountered the same problem. I'm using the latest version, 0.6.16, Docker, self-hosted, with the qwen text-embedding-v2 model.
@crazywoola Looks like the issue still exists. Please help take a look again.