"Model Parameter max_tokens should be less than or equal to 1500.0." when enabling "Segmenting in Question & Answer format"
Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.6.3
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
- Go to the settings of a document in a knowledge base with "Segmenting in Question & Answer format" disabled
- Enable "Segmenting in Question & Answer format"
- An error message saying "Model Parameter max_tokens should be less than or equal to 1500.0." pops up on the top right corner.
- The paragraph preview page is blank.
I am using Qwen's text-embedding-v2 model. I cannot find any place in the document or knowledge base settings where I can set the 'max_tokens' parameter mentioned in the error message.
✔️ Expected Behavior
No error message, and generated paragraphs are shown after building embedding.
❌ Actual Behavior
- An error message saying "Model Parameter max_tokens should be less than or equal to 1500.0." pops up on the top right corner.
- The paragraph preview page is blank.
The error you're encountering with the "Segmenting in Question & Answer format" feature in Dify version 0.6.3 is due to the `max_tokens` parameter exceeding its limit of 1500. This parameter specifies the maximum number of tokens the model can process or generate at a time. Since Dify doesn't provide a GUI option to adjust this parameter for Qwen's text-embedding-v2 model, you'll need to look into the application's configuration or codebase for a solution.
One approach is to modify the parameter directly in the configuration file or the code where the model is invoked, typically the model's YAML configuration file or the indexing runner script. For Dify, relevant files might include `api/core/model_runtime/model_providers/tongyi/llm/qwen-turbo.yaml` or `api/core/indexing_runner.py`, where you can look for ways to adjust the `max_tokens` value.
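For illustration, here is a minimal sketch of what such a change could look like in application code, assuming the Q&A-generation call receives its completion parameters as a plain dict (the names `MODEL_MAX_TOKENS` and `clamp_completion_params` are hypothetical, not Dify's actual API):

```python
# Hypothetical sketch: cap the requested max_tokens before the Q&A-generation
# call, so the provider's validation rule (max 1500 per the error) never trips.
MODEL_MAX_TOKENS = 1500  # the limit reported in the error message

def clamp_completion_params(params: dict) -> dict:
    """Return a copy of the completion params with max_tokens capped."""
    capped = dict(params)
    if capped.get("max_tokens", MODEL_MAX_TOKENS) > MODEL_MAX_TOKENS:
        capped["max_tokens"] = MODEL_MAX_TOKENS
    return capped

# Example: a request asking for 2000 tokens gets clamped down to 1500.
print(clamp_completion_params({"max_tokens": 2000, "temperature": 0.7}))
# -> {'max_tokens': 1500, 'temperature': 0.7}
```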
If direct modification isn't feasible, or if you're unsure how to proceed, consider a segmentation strategy that breaks the input content into smaller chunks that each fall under the `max_tokens` limit. Process each chunk individually and then aggregate the results.
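A rough sketch of that chunking strategy, using tiktoken's `cl100k_base` encoding as a stand-in tokenizer (an assumption; Qwen's actual tokenizer differs, so token counts are approximate):

```python
import tiktoken  # pip install tiktoken

def split_into_chunks(text: str, limit: int = 1500) -> list[str]:
    """Split text into pieces of at most `limit` tokens each."""
    enc = tiktoken.get_encoding("cl100k_base")  # proxy tokenizer, not Qwen's own
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]

# Each chunk can then go through Q&A generation separately, and the
# resulting segments can be aggregated afterwards.
sample = "Dify segments documents before generating Q&A pairs. " * 500
chunks = split_into_chunks(sample)
print(f"{len(chunks)} chunks, each at most 1500 tokens")
```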
For detailed guidance or if there's a mechanism to adjust model parameters through another method in Dify, consulting the official documentation or reaching out to the support team would be advisable.
Sources
- api/core/model_runtime/model_providers/tongyi/llm/qwen-turbo.yaml
- api/core/indexing_runner.py
- api/core/model_runtime/model_providers/tongyi/llm/qwen-plus.yaml
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
Please reduce the max token setting when you split the document; your LLM model only supports 1500 tokens, because we are using the LLM model to generate the Q&A. @LazarusX
@JohnJyong Thanks for the response. I am using Tongyi's text-embedding-v2. I cannot find anywhere to set the max token, either in the model provider settings, knowledge base settings, or document settings.
Could you point me to the right location?
I ran into the same trouble and eventually gave up on the Q&A format.
Please upgrade to the latest version to see if this issue persists. :)
I have encountered the same problem. I'm using the latest version, 0.6.11, Docker, self-hosted, with the qwen text-embedding-v2 model.
+1
I have encountered the same problem. I'm using the latest version, 0.6.16, Docker, self-hosted, with the qwen text-embedding-v2 model.
@crazywoola Looks like the issue still exists. Please help take a look again.