
"Model Parameter max_tokens should be less than or equal to 1500.0." when enabling "Segmenting in Question & Answer format"

LazarusX opened this issue 10 months ago · 4 comments

Self Checks

  • [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Go to the settings of a document in a knowledge base with "Segmenting in Question & Answer format" disabled
  2. Enable "Segmenting in Question & Answer format"
  3. An error message saying "Model Parameter max_tokens should be less than or equal to 1500.0." pops up in the top right corner.
  4. The paragraph preview page is blank.

I am using Qwen's text-embedding-v2 model. I cannot find any place in the document or knowledge base settings where I can set the 'max_tokens' parameter mentioned in the error message.

✔️ Expected Behavior

No error message, and the generated paragraphs are shown after the embeddings are built.

❌ Actual Behavior

  1. An error message saying "Model Parameter max_tokens should be less than or equal to 1500.0." pops up in the top right corner.
  2. The paragraph preview page is blank.

LazarusX avatar Apr 18 '24 07:04 LazarusX

The error you're encountering with the "Segmenting in Question & Answer format" feature in Dify version 0.6.3 is due to the max_tokens parameter exceeding its limit of 1500. This parameter specifies the maximum number of tokens the model can process or generate at a time. Since Dify doesn't provide an option in the GUI to adjust this parameter for Qwen's text-embedding-v2 model, you'll need to look into the application's configuration or codebase for a solution.

One approach is to modify the parameter directly where the model is invoked, typically in the model's YAML configuration file or the indexing runner script. For Dify, relevant files might include api/core/model_runtime/model_providers/tongyi/llm/qwen-turbo.yaml or api/core/indexing_runner.py, where you can look for ways to adjust the max_tokens value.
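
For illustration, here is a minimal sketch of clamping max_tokens to the provider's limit before the model is invoked. The names below (MODEL_MAX_TOKENS, clamp_max_tokens, completion_params) are illustrative assumptions, not identifiers from the Dify codebase.

```python
# Hypothetical sketch: cap max_tokens at the limit the provider reports
# for this model before the Q&A-generation call. MODEL_MAX_TOKENS and
# completion_params are illustrative names, not Dify's actual identifiers.
MODEL_MAX_TOKENS = 1500

def clamp_max_tokens(completion_params: dict) -> dict:
    """Return a copy of the params with max_tokens capped at the model limit."""
    params = dict(completion_params)
    requested = params.get("max_tokens", MODEL_MAX_TOKENS)
    params["max_tokens"] = min(requested, MODEL_MAX_TOKENS)
    return params

print(clamp_max_tokens({"max_tokens": 2000}))  # {'max_tokens': 1500}
```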

If direct modification isn't feasible or if you're unsure how to proceed, consider implementing a segmentation strategy that breaks down the input content into smaller chunks that each fall under the max_tokens limit. Process each chunk individually and then aggregate the results.
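
A rough sketch of that strategy, assuming a crude characters-per-token estimate and a caller-supplied process_chunk function (both are assumptions, not part of Dify):

```python
# Rough sketch of chunked processing under a token budget.
# The 4-characters-per-token heuristic and process_chunk() are
# illustrative assumptions; a real tokenizer would be more accurate.
MAX_TOKENS = 1500
CHARS_PER_TOKEN = 4

def split_into_chunks(text: str) -> list[str]:
    """Split text into pieces whose estimated token count stays under the limit."""
    max_chars = MAX_TOKENS * CHARS_PER_TOKEN
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def process_document(text: str, process_chunk) -> list:
    """Run process_chunk on each piece, then aggregate the results."""
    return [process_chunk(chunk) for chunk in split_into_chunks(text)]
```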

For detailed guidance, or to check whether Dify exposes another way to adjust model parameters, consult the official documentation or reach out to the support team.

About Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] avatar Apr 18 '24 07:04 dosubot[bot]

Please reduce the max tokens setting when you split the document; your LLM model only supports 1500 tokens, and we use the LLM model to generate the Q&A. @LazarusX
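
As a quick sanity check before re-running the Q&A step, one could estimate whether each segment fits under the 1500-token limit. This sketch uses a rough character-count heuristic (an assumption; an actual tokenizer would give exact counts):

```python
# Illustrative check: estimate whether a segment fits under the model's
# 1500-token limit. The chars-per-token heuristic is an assumption.
def estimated_tokens(segment: str, chars_per_token: int = 4) -> int:
    """Very rough token estimate based on character count."""
    return max(1, len(segment) // chars_per_token)

def fits_model_limit(segment: str, limit: int = 1500) -> bool:
    return estimated_tokens(segment) <= limit
```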

JohnJyong avatar Apr 29 '24 08:04 JohnJyong

@JohnJyong Thanks for the response. I am using Tongyi's text-embedding-v2. I cannot find anywhere to set the max tokens value, whether in the model provider settings, the knowledge base settings, or the document settings.

Could you point me to the right location?

LazarusX avatar Apr 29 '24 09:04 LazarusX

I had the same trouble and eventually gave up on the Q&A segmentation.

leeumm avatar May 07 '24 01:05 leeumm

Please upgrade to the latest version to see if this issue persists. :)

crazywoola avatar May 28 '24 07:05 crazywoola

I have encountered the same problem. I'm using the latest version (0.6.11), self-hosted with Docker, with the Qwen text-embedding-v2 model.

StephenPCG avatar Jun 17 '24 10:06 StephenPCG

+1

kaojistream avatar Jun 20 '24 05:06 kaojistream

I have encountered the same problem. I'm using the latest version (0.6.16), self-hosted with Docker, with the Qwen text-embedding-v2 model.

charliex2 avatar Aug 08 '24 04:08 charliex2

@crazywoola Looks like the issue still exists. Please help take a look again.

LazarusX avatar Aug 08 '24 05:08 LazarusX