Max Tokens Parameter Not Functioning Correctly
Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (我已阅读并同意 Language Policy).
- [X] [FOR CHINESE USERS] 请务必使用英文提交 Issue,否则会被关闭。谢谢!:)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.9.1-fix1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
The max_tokens parameter is supposed to limit the number of tokens, but it appears not to be functioning correctly. For instance, for the Claude 3.5 Sonnet model the official maximum token limit is documented as 8192 tokens, yet I was able to input more than 10,000 tokens without encountering any errors or warnings, and the model responded without enforcing the limit. In models like Sonnet or Gemini we cannot activate the max tokens parameter, but in OpenAI models we can. Additionally, could you please clarify what the maximum token limit is and how I can change it?
✔️ Expected Behavior
The model should show an error message indicating "You have reached the limit of max tokens" and should not provide a response if the token limit is exceeded.
❌ Actual Behavior
I was able to input more than 10,000 tokens, and the model operated normally without any indication of exceeding the limit.
I found some similar discussions and issues related to the max_tokens parameter not functioning correctly:
- Max Tokens Parameter Not Functioning Correctly - This discussion is ongoing and might provide insights or solutions to the issue you're experiencing [1].
- Max Token Limits Incorrect? - Another discussion that might be relevant to your problem [2].
Additionally, there was a closed issue titled Wrong max tokens limit for OpenRouter Sonnet 3.5, which was resolved by a pull request that corrected the max_tokens for the model [3].
Regarding your question about the maximum token limit and how to change it, the default setting for max_tokens in the Dify configuration files is 512, but it can be adjusted based on the model's capabilities. The implementation logic for max_tokens involves checking the model configuration parameters and recalculating max_tokens if the sum of prompt tokens and max_tokens exceeds the model's token limit [4].
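The recalculation described above can be sketched roughly as follows (a minimal illustration, not Dify's actual code; the function name and signature are hypothetical):

```python
def clamp_max_tokens(prompt_tokens: int, max_tokens: int, context_size: int) -> int:
    """If prompt + requested completion would exceed the model's context
    window, shrink max_tokens so the request still fits (hypothetical helper)."""
    if prompt_tokens + max_tokens > context_size:
        return max(context_size - prompt_tokens, 1)
    return max_tokens

# Example: 200k-token context window, 190k-token prompt,
# user requested 20k completion tokens -> clamped to 10k
print(clamp_max_tokens(190_000, 20_000, 200_000))  # 10000
```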
If the max_tokens parameter is not functioning as expected, it might be worth checking if there are any updates or patches available for your version of Dify, as similar issues have been resolved in past updates.
Same here: https://github.com/langgenius/dify/issues/9904. You are mixing it up with the context size; the Claude models have a very large context size, so typically you cannot exceed it without sending a large file. You can hover over the question mark next to the parameter to see its meaning.
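The distinction can be illustrated with a small sketch (the numbers below reflect Claude 3.5 Sonnet's commonly documented limits, but treat them as assumptions; the function is hypothetical, not Dify code):

```python
# Assumed limits for Claude 3.5 Sonnet:
CONTEXT_SIZE = 200_000      # cap on prompt + completion tokens combined
MAX_OUTPUT_TOKENS = 8_192   # the "max tokens" parameter caps only the completion

def check_request(prompt_tokens: int, requested_output: int) -> str:
    """Hypothetical validation showing which limit applies where."""
    if requested_output > MAX_OUTPUT_TOKENS:
        return "error: max_tokens exceeds the model's output cap"
    if prompt_tokens + requested_output > CONTEXT_SIZE:
        return "error: request exceeds the context window"
    return "ok"

# A 10,000-token prompt is fine: it counts against the 200k context
# window, not against the 8,192-token output cap.
print(check_request(10_000, 4_096))  # ok
```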