
Raptor: [Bug]: **ERROR**: Unknown field for GenerationConfig: max_tokens

Open · Drasek opened this issue 6 months ago · 5 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

dunno

RAGFlow image version

v0.19.0 slim

Other environment information

AMD Ryzen 7700 / 96GB RAM
Proxmox VM - Debian 12

Actual behavior

Error, see below

Expected behavior

Parse the File

Steps to reproduce

- Install v0.19.0 slim
- Assign Embedding / Chat Model
- Create Knowledge Base
- Turn on Raptor
- Upload PDF & Parse

Additional information

Fresh install; I just used Ollama with nomic-embed-text:latest for embedding and Google Gemini Pro 2.5 for chat. Created a knowledge base and turned on Raptor to check it out. Uploaded my first PDF and hit the "play" button to start parsing. It creates the chunks and then fails when Raptor starts. With Raptor turned off, parsing works, but on a different knowledge base, once I turned Raptor on and tried it, I get the same error even if I turn Raptor off again...

Begin at: Tue, 17 Jun 2025 10:26:18 GMT
Duration: 33.27 s
Progress:
10:26:44 Page(1~6): Finish parsing.
10:26:44 Page(1~6): Generate 4 chunks
10:26:44 Page(1~6): Embedding chunks (0.27s)
10:26:44 Page(1~6): Indexing done (0.03s). Task done (23.53s)
10:26:45 created task raptor
10:26:46 Task has been received.
10:26:47 [ERROR][Exception]: Exceptions from Trio nursery (3 sub-exceptions) -- ERROR: Unknown field for GenerationConfig: max_tokens

Drasek · Jun 17 '25 08:06

Hi @Drasek! I'm Dosu and I’m helping the ragflow team.

The error happens because the max_tokens field was removed from RAGFlow’s GenerationConfig starting in v0.17.2, and this change is present in v0.19.0 slim as well. The current code and the Huggingface transformers library (which RAGFlow uses for most LLMs) do not accept max_tokens—only max_new_tokens is valid. If your config, UI, or workflow still references max_tokens, you’ll get this error.

To fix it, remove any max_tokens references from your configs and manage token limits through your model provider’s settings instead. For Huggingface-based models, use max_new_tokens if you need to set a generation limit. If you’re using an API like OpenAI or Anthropic, those still use max_tokens, but not the Huggingface/transformers path in RAGFlow. More details are in the release notes and related issues.
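
As a rough sketch of what the fix looks like in code (a hypothetical helper, not RAGFlow's actual implementation; the function and provider names are illustrative), the idea is to translate a generic max_tokens into whatever field the target backend actually accepts before building its generation config:

```python
# Hedged sketch only -- not RAGFlow's real code. sanitize_gen_conf and the
# provider strings are hypothetical names used for illustration.
def sanitize_gen_conf(gen_conf: dict, provider: str) -> dict:
    conf = dict(gen_conf)  # work on a copy, never mutate the caller's config
    if "max_tokens" not in conf:
        return conf
    limit = conf.pop("max_tokens")
    if provider == "huggingface":
        conf["max_new_tokens"] = limit     # transformers accepts max_new_tokens only
    elif provider == "ollama":
        conf["num_predict"] = limit        # Ollama's name for the generation limit
    elif provider == "gemini":
        conf["max_output_tokens"] = limit  # Gemini's GenerationConfig has no max_tokens
    else:
        conf["max_tokens"] = limit         # OpenAI/Anthropic-style APIs keep it as-is
    return conf
```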

To reply, just mention @dosu.



dosubot[bot] · Jun 17 '25 08:06

Hey @dosu, I am lost. This is a fresh install; the only things I added were two Ollama models (and a connection to Google Gemini), and I had to fill in "Max Tokens" in the UI to add a model with Ollama, it's a mandatory field. I am not using Hugging Face (directly). Please point me to where I need to change the config, or to what I did wrong. I have not touched chat, agents, or anything else. Fresh install, added models, created a knowledge base, turned on Raptor, added one file -> error.

Drasek · Jun 17 '25 08:06

The "Max Tokens" field is required for Ollama models in the UI because RAGFlow uses it to set the num_predict parameter for the Ollama API—it’s not optional for Ollama integration and is handled correctly there. The error you’re seeing happens when this max_tokens config leaks into other parts of RAGFlow (like Raptor or non-Ollama models), which expect a different config and don’t recognize max_tokens, causing the crash (source).

You didn’t do anything wrong in the Ollama setup. The issue is that RAGFlow doesn’t always filter out Ollama-specific fields before passing configs to other model backends. There isn’t a user-facing config to fix this right now. As a workaround, try making sure your knowledge base and Raptor tasks are only using Ollama models for both embedding and chat, and avoid mixing in models like Gemini or others for now. If you must use multiple providers, you may need to wait for an upstream fix that better isolates model-specific config fields.

If you want to double-check, you can look at your model assignments for the knowledge base and make sure only Ollama models are selected for both embedding and chat. If you’re still hitting the error with only Ollama models, let me know—there may be a deeper bug.

To reply, just mention @dosu.



dosubot[bot] · Jun 17 '25 08:06

@dosu Switching to Qwen3:32b on Ollama solved it, thank you. I need a multi-language embedding model (for German), which is why I use nomic-embed-text. Which platform can I use to host the embedding model locally that does not require the "Max Tokens" field during setup, so I am free to choose a big LLM for chat from a provider like Google, DeepSeek, etc.? vLLM?

Drasek · Jun 17 '25 09:06

You can use vLLM to host your multi-language embedding model (like nomic-embed-text) locally—vLLM integration in RAGFlow does not require the "Max Tokens" field for embedding models, so you won’t hit the same config issue as with Ollama. This setup lets you run your embedding model locally and freely choose a large LLM for chat from providers like Google or Deepseek, as RAGFlow separates embedding and chat model configs and supports mixed-provider setups as long as the frontend doesn’t enforce contradictory requirements. Huggingface and direct API backends are also flexible for embeddings if you want alternatives to vLLM.
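
For example, once a vLLM server is running (something like vllm serve nomic-ai/nomic-embed-text-v1.5 --trust-remote-code; the exact flags for embedding models depend on your vLLM version, so check its docs), you can query it through the OpenAI-compatible embeddings endpoint. A sketch assuming the openai Python client and vLLM's default port 8000:

```python
# Sketch assuming a local vLLM server exposing the OpenAI-compatible API on
# the default port; the model name mirrors whatever was passed to `vllm serve`.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.embeddings.create(
    model="nomic-ai/nomic-embed-text-v1.5",
    input=["Guten Tag, wie geht es Ihnen?"],  # German text for a multilingual model
)
print(len(resp.data[0].embedding))  # dimensionality of the returned vector
```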

If you run into any UI field enforcement for "Max Tokens" when registering a non-Ollama embedding model, try using the API or config files directly, as the backend itself does not require it for vLLM or Huggingface embeddings.

To reply, just mention @dosu.



dosubot[bot] · Jun 17 '25 09:06