OpenAI API Breaking Change - Batch Processing Fails with "max_tokens" Parameter Error
Archon Version
Latest Docker images (as of 2025-08-28)
Bug Severity
🟠 High - Blocks important features
Bug Description
When crawling documentation sites for the knowledge base, batch processing fails during the contextual embedding generation phase because of an OpenAI API parameter change. The system sends `max_tokens` to newer OpenAI models (here, GPT-5 Nano) that require `max_completion_tokens` instead.
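For illustration, a minimal sketch of the failing call and the working variant using the standard `openai` Python SDK (the model name is taken from the logs below; the prompt is a stand-in):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Fails on newer models such as gpt-5-nano-2025-08-07 with:
# Error code: 400 - "Unsupported parameter: 'max_tokens' is not supported..."
resp = client.chat.completions.create(
    model="gpt-5-nano-2025-08-07",
    messages=[{"role": "user", "content": "Summarize this chunk."}],
    max_tokens=200,
)

# Works: these models expect max_completion_tokens instead
resp = client.chat.completions.create(
    model="gpt-5-nano-2025-08-07",
    messages=[{"role": "user", "content": "Summarize this chunk."}],
    max_completion_tokens=200,
)
```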
Steps to Reproduce
- Go to Knowledge Base page
- Click "Add Knowledge"
- Enter URL: https://docs.anthropic.com (or any documentation site)
- Click "Add Source"
- Crawling starts successfully
- Batch processing begins but gets stuck at "Processing batch 1 of X"
- Check Docker logs to see the error
Expected Behavior
The site should be crawled successfully, contextual embeddings generated, and the content added to the knowledge base for RAG queries.
Actual Behavior
Crawling completes but batch processing fails silently in the UI. The process appears stuck at "Processing batch 1 of X" indefinitely. No contextual embeddings are generated (0/25 successful).
Error Details (if any)
```
2025-08-28 03:32:30 | src.server.services.llm_provider_service | ERROR | Error creating LLM client for provider openai: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
2025-08-28 03:32:30 | search | ERROR | Error in contextual embedding batch: Error code: 400 - {'error': {'message': "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.", 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': 'unsupported_parameter'}}
2025-08-28 03:32:30 | search | INFO | Batch 1: Generated 0/25 contextual embeddings using batch API (sub-batch size: 50)
```
Affected Component
🔍 Knowledge Base / RAG
Browser & OS
Chrome on macOS
Additional Context
The issue occurs because Archon is configured to use gpt-5-nano-2025-08-07, a newer OpenAI model that no longer accepts the `max_tokens` parameter and requires `max_completion_tokens` instead. This is a breaking change in OpenAI's API.
Possible fixes:
- Update the LLM provider service to send `max_completion_tokens` when it detects a newer OpenAI model
- Switch the default model to one that still supports `max_tokens` (e.g., gpt-4-turbo)
- Add model compatibility detection and parameter mapping (see the sketch below)
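A minimal sketch of the third option, assuming the provider service assembles its completion kwargs in one place. The helper name and the model-prefix list are hypothetical, not Archon's actual code:

```python
# Hypothetical helper: remap max_tokens to max_completion_tokens for models
# that no longer accept it. The prefix list is an assumption and would need
# to be kept in sync with OpenAI's model lineup.
MAX_COMPLETION_TOKENS_PREFIXES = ("gpt-5", "o1", "o3", "o4")

def adapt_token_param(model: str, kwargs: dict) -> dict:
    """Return completion kwargs with the token limit under the key the model expects."""
    if "max_tokens" in kwargs and model.startswith(MAX_COMPLETION_TOKENS_PREFIXES):
        kwargs = dict(kwargs)  # avoid mutating the caller's dict
        kwargs["max_completion_tokens"] = kwargs.pop("max_tokens")
    return kwargs

# Usage at the call site:
# params = adapt_token_param(model, {"max_tokens": 200})
# client.chat.completions.create(model=model, messages=messages, **params)
```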
Service Status (check all that are working)
- [x] 🖥️ Frontend UI (http://localhost:3737)
- [x] ⚙️ Main Server (http://localhost:8181)
- [x] 🔗 MCP Service (localhost:8051)
- [x] 🤖 Agents Service (http://localhost:8052)
- [x] 💾 Supabase Database (connected)
We don't support GPT-5 just yet, but thanks for reporting! I'm adding it to the backlog.