
max_tokens defaults to 32000 when using a custom provider

Open · nmartorell opened this issue 4 months ago • 3 comments

Hi,

I'm using LLM models (Anthropic, OpenAI, and Bedrock) through an OpenAI-compatible LLM API Gateway. I configured a custom provider in opencode to point at this LLM Gateway via the opencode.json file, e.g.

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "myprovider": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Custom LLM Gateway",
      "options": {
        "baseURL": "<GATEWAY_URL",
        "apiKey": "<API_KEY>",
      },
      "models": {
        "openai:gpt-4o-mini": {
          "name": "gpt-4o-mini"
        },
        "anthropic:claude-3-5-haiku-20241022": {
          "name": "claude-3-5-haiku-20241022"
        }
      }
    }
  }
}

When I try to use either of these models through opencode, I receive error messages suggesting that something (maybe opencode, maybe the Vercel AI SDK) is defaulting the OpenAI Chat Completions max_tokens field to 32000, which unfortunately is far too large for these models.

As an example, here is the error I see in the LLM Gateway logs when attempting to use the gpt-4o-mini model (I see a similar error with the Anthropic models):

  "error": {
    "message": "max_tokens is too large: 32000. This model supports at most 16384 completion tokens, whereas you provided 32000.",
    "type": "invalid_request_error",
    "param": "max_tokens",
    "code": "invalid_value"
  }
}
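
Based on that error, I believe the request body reaching the gateway looks roughly like the following standard Chat Completions payload; the max_tokens value is inferred from the error above, and the exact set of fields opencode sends is my assumption:

{
  "model": "openai:gpt-4o-mini",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "max_tokens": 32000
}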

I've searched the opencode and Vercel AI SDK codebases to find where this max_tokens value is being set, but unfortunately I can't find it. Assuming it is being set by opencode, it would be ideal if this value were left unset when using custom providers, since my LLM Gateway already tracks the max_tokens parameter for each provider/model.

A couple other notes:

  • The reason I know the max_tokens parameter is being set by opencode or the Vercel AI SDK is that when I send my own HTTP request to the LLM Gateway without specifying max_tokens, the request goes through (a sketch that reproduces the error by setting max_tokens explicitly follows after these notes), e.g.
curl --header 'Authorization: Bearer <API TOKEN>' -H "Content-Type: application/json" -X POST --data '{"model":"openai:gpt-4o-mini","messages":[ {"role": "user", "content": "Hello!"}]}' <GATEWAY_URL>/v1/chat/completions
  • When I modify the opencode.json file to include context and max output token limits, opencode works as expected (I know this is a workaround, but I'd really prefer not to hardcode the context and output token limits in both places):
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "myprovider": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Custom LLM Gateway",
      "options": {
        "baseURL": "<GATEWAY_URL",
        "apiKey": "<API_KEY>",
      },
      "models": {
        "openai:gpt-4o-mini": {
          "name": "gpt-4o-mini",
          "limit": {
            "context": 10000,
            "output": 5000
          }
        },
        "anthropic:claude-3-5-haiku-20241022": {
          "name": "claude-3-5-haiku-20241022",
          "limit": {
            "context": 10000,
            "output": 5000
          }
        }
      }
    }
  }
}
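
Here is the reproduction sketch mentioned in the first note: adding max_tokens to the otherwise identical curl request should reproduce the "max_tokens is too large" error shown above (assuming the gateway forwards max_tokens unchanged):

curl --header 'Authorization: Bearer <API TOKEN>' -H "Content-Type: application/json" -X POST --data '{"model":"openai:gpt-4o-mini","messages":[{"role": "user", "content": "Hello!"}],"max_tokens":32000}' <GATEWAY_URL>/v1/chat/completions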

Please let me know if my understanding is wrong and this parameter is being set elsewhere. Also, please let me know if there is any additional information required to troubleshoot.

Thank you, and thanks for making such an awesome tool!

nmartorell · Aug 08 '25 22:08