OpenAI API Breaking Change - Batch Processing Fails with "max_tokens" Parameter Error
Archon Version
Latest Docker images (as of 2025-08-28)
Bug Severity
🟠 High - Blocks important features
Bug Description
When crawling documentation sites for the knowledge base, batch processing fails during the contextual embedding generation phase because of an OpenAI API parameter change. The system sends `max_tokens` to newer OpenAI models (here, GPT-5 Nano) that require `max_completion_tokens` instead.
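For illustration, a minimal sketch of the failing call and the working variant using the standard `openai` Python SDK (the model name is taken from the logs below; the prompt is a stand-in):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Fails on newer models such as gpt-5-nano-2025-08-07 with:
# Error code: 400 - "Unsupported parameter: 'max_tokens' is not supported..."
resp = client.chat.completions.create(
    model="gpt-5-nano-2025-08-07",
    messages=[{"role": "user", "content": "Summarize this chunk."}],
    max_tokens=200,
)

# Works: these models expect max_completion_tokens instead
resp = client.chat.completions.create(
    model="gpt-5-nano-2025-08-07",
    messages=[{"role": "user", "content": "Summarize this chunk."}],
    max_completion_tokens=200,
)
```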
Steps to Reproduce
- Go to Knowledge Base page
- Click "Add Knowledge"
- Enter URL: https://docs.anthropic.com (or any documentation site)
- Click "Add Source"
- Crawling starts successfully
- Batch processing begins but gets stuck at "Processing batch 1 of X"
- Check Docker logs to see the error
Expected Behavior
The site should be crawled successfully, contextual embeddings generated, and the content added to the knowledge base for RAG queries.
Actual Behavior
Crawling completes but batch processing fails silently in the UI. The process appears stuck at "Processing batch 1 of X" indefinitely. No contextual embeddings are generated (0/25 successful).
Error Details (if any)
```
2025-08-28 03:32:30 | src.server.services.llm_provider_service | ERROR | Error creating LLM client for provider openai: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
2025-08-28 03:32:30 | search | ERROR | Error in contextual embedding batch: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
2025-08-28 03:32:30 | search | INFO | Batch 1: Generated 0/25 contextual embeddings using batch API (sub-batch size: 50)
```
Affected Component
🔍 Knowledge Base / RAG
Browser & OS
Chrome on macOS
Additional Context
The issue occurs because Archon is configured to use gpt-5-nano-2025-08-07, a newer OpenAI model that no longer accepts the `max_tokens` parameter and requires `max_completion_tokens` instead. This is a breaking change in OpenAI's API.
Possible fixes:
- Update the LLM provider service to send `max_completion_tokens` when it detects a newer OpenAI model
- Switch the default model to one that still supports `max_tokens` (e.g., gpt-4-turbo)
- Add model compatibility detection and parameter mapping (see the sketch below)
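A minimal sketch of the third option, assuming the provider service assembles its completion kwargs in one place. The helper name and the model-prefix list are hypothetical, not Archon's actual code:

```python
# Hypothetical helper: remap max_tokens to max_completion_tokens for models
# that no longer accept it. The prefix list is an assumption and would need
# to be kept in sync with OpenAI's model lineup.
MAX_COMPLETION_TOKENS_PREFIXES = ("gpt-5", "o1", "o3", "o4")

def adapt_token_param(model: str, kwargs: dict) -> dict:
    """Return completion kwargs with the token limit under the key the model expects."""
    if "max_tokens" in kwargs and model.startswith(MAX_COMPLETION_TOKENS_PREFIXES):
        kwargs = dict(kwargs)  # avoid mutating the caller's dict
        kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
    return kwargs

# Usage at the call site:
# params = adapt_token_param(model, {"max_tokens": 200})
# client.chat.completions.create(model=model, messages=messages, **params)
```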
Service Status (check all that are working)
- [x] 🖥️ Frontend UI (http://localhost:3737)
- [x] ⚙️ Main Server (http://localhost:8181)
- [x] 🔗 MCP Service (localhost:8051)
- [x] 🤖 Agents Service (http://localhost:8052)
- [x] 💾 Supabase Database (connected)
We don't support GPT-5 just yet, but thanks for reporting! I'm adding it to the backlog.