[Bug] Environment Variable CLAUDE_CODE_MAX_OUTPUT_TOKENS Not Respected
Bug Description
The CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable is ignored.
Environment Info
- Platform: linux
- Terminal: kitty
- Version: 1.0.120
- Feedback ID: ea5ee211-67b4-4ee4-933e-dfeca4f5c48e
I have also seen this issue:
Environment Info
- Platform: macOS against AWS Bedrock (Opus 4.1)
- Terminal: IntelliJ Terminal
- Version: 1.0.120
Claude Code v1.0.120
└ Session ID: 6c721f24-bd9b-4ccd-84dd-3f98811ec084
Working Directory
└ omitted
IDE Integration • /config
✔ Connected to IntelliJ IDEA extension
⚠ Error installing IntelliJ IDEA plugin: Plugin source missing
Please restart your IDE or try installing from https://docs.claude.com/s/claude-code-jetbrains
API Configuration
└ API Provider: AWS Bedrock
└ AWS Region: us-west-2
Memory • /memory
└ project: CLAUDE.md
Model • /model
└ arn:aws:bedrock:::inference-profile/us.anthropic.claude-opus-4-20250514-v1:0
⎿ API Error: 400 Input is too long for requested model.
> /compact
⎿ Error: Error during compaction: Error: API Error: 400 input length and `max_tokens` exceed context limit: 202111 + 20000 > 204658, decrease input length or
`max_tokens` and try again
! echo $CLAUDE_CODE_MAX_OUTPUT_TOKENS
⎿ 2048
I encountered the same issue: the parameter setting is not taking effect.
export CLAUDE_CODE_MAX_OUTPUT_TOKENS=50000
⎿ API Error: Claude's response exceeded the 32000 output token maximum. To configure this behavior, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable.
⎿ Error writing file
⎿ API Error: Claude's response exceeded the 32000 output token maximum. To configure this behavior, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable.
✶ Creating test script for test_case_19… (esc to interrupt · ctrl+t to show todos · 462s · ↓ 7.4k tokens)
⎿ Next: Create test script for test_case_20
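One thing worth ruling out before blaming Claude Code (a troubleshooting sketch, not part of the original reports): make sure the variable is actually exported, since a variable assigned without `export` is not inherited by child processes such as the `claude` binary.

```shell
# Verify CLAUDE_CODE_MAX_OUTPUT_TOKENS is exported, not merely set:
# only exported variables are visible to child processes.
export CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000
sh -c 'echo "$CLAUDE_CODE_MAX_OUTPUT_TOKENS"'   # prints 64000 if exported
```

If the child shell prints an empty line, the variable never reached the process, and the reports above would be a shell-configuration problem rather than a Claude Code bug.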
I'm seeing this constantly with AWS (global.anthropic.claude-sonnet-4-5-20250929-v1:0) too, and without the env var set at all, just out of the box.
Interesting! These cases are actually the opposite of what I wanted to achieve: setting max_tokens to 64000 so Claude can produce longer outputs.
The current implementation seems like an unnecessary mess that's easily fixable:
- When the response stops because of max_tokens, use the truncated response instead of returning an API error and discarding it.
- Dynamic max_tokens: if the input length plus max_tokens exceeds the context limit, reduce max_tokens to fit.
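The second suggestion can be sketched as follows. This is a hypothetical illustration, not Claude Code's actual implementation: the context limit of 204658 is taken from the Bedrock 400 error quoted above, and the minimum-output floor is an assumption.

```shell
# Hypothetical "dynamic max_tokens" clamping; not Claude Code's real code.
context_limit=204658     # limit reported by the Bedrock 400 error above
min_output_tokens=1024   # assumed floor below which a reply is useless

clamp_max_tokens() {     # usage: clamp_max_tokens INPUT_TOKENS REQUESTED_MAX
  available=$((context_limit - $1))
  if [ "$available" -lt "$min_output_tokens" ]; then
    # Too little room even for a short reply; the client should compact
    # or truncate the input instead of sending the request.
    echo "input too long: only $available output tokens left" >&2
    return 1
  fi
  if [ "$2" -lt "$available" ]; then echo "$2"; else echo "$available"; fi
}

# The failing request from the log: 202111 input + 20000 requested output.
clamp_max_tokens 202111 20000   # prints 2547: the request now fits the limit
```

With this clamp, the `/compact` failure above (202111 + 20000 > 204658) would instead go through with max_tokens reduced to 2547, at the cost of a possibly truncated response.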
This issue has been inactive for 30 days. If the issue is still occurring, please comment to let us know. Otherwise, this issue will be automatically closed in 30 days for housekeeping purposes.