Bedrock Claude 3.5 V1 inferenceConfig maxTokens / max_tokens of 8192 is ignored even when set; output is always capped at 4096
Describe the bug
Bedrock Claude 3.5 V1 calls ignore max_tokens: output is always capped at 4096 tokens, even when 8192 is requested.
{
  "schemaType": "ModelInvocationLog",
  "schemaVersion": "1.0",
  "timestamp": "2025-05-06T08:44:53Z",
  "accountId": "redacted",
  "identity": {
    "arn": "redacted"
  },
  "region": "eu-central-1",
  "requestId": "1d081c6a-9b89-422b-99de-c0056b8008bc",
  "operation": "Converse",
  "modelId": "arn:aws:bedrock:eu-central-1:redacted:inference-profile/eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
  "input": {
    "inputContentType": "application/json",
    "inputBodyJson": {
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "text": "redacted"
            }
          ]
        }
      ],
      "system": [
        {
          "text": "redacted"
        }
      ],
      "inferenceConfig": {
        "maxTokens": 8192, // <---- THIS
        "temperature": 0.7
      },
      "toolConfig": {
        "tools": ["redacted"],
        "toolChoice": {
          "tool": {
            "name": "redacted"
          }
        }
      },
      "additionalModelRequestFields": {}
    },
    "inputTokenCount": 2085
  },
  "output": {
    "outputContentType": "application/json",
    "outputBodyJson": {
      "output": {
        "message": {
          "role": "assistant",
          "content": [
            {
              "toolUse": {
                "toolUseId": "tooluse_KSDC5-7STHupgDNlmsJdIQ",
                "name": "redacted",
                "input": {
                  "status": "success",
                  "command": "update",
                  "collection": "redacted",
                  "item_id": "redacted"
                }
              }
            }
          ]
        }
      },
      "stopReason": "max_tokens",
      "metrics": {
        "latencyMs": 87074
      },
      "usage": {
        "inputTokens": 2085,
        "outputTokens": 4096, // <---- THIS
        "totalTokens": 6181
      }
    },
    "outputTokenCount": 4096 // <---- THIS
  },
  "inferenceRegion": "eu-central-1"
}
Regression Issue
- [ ] Select this option if this issue appears to be a regression.
Expected Behavior
maxTokens should be honored: with maxTokens set to 8192, the response should not be cut off at 4096 output tokens with stopReason max_tokens.
Current Behavior
Responses that need more than ~4k output tokens are stopped at exactly 4096 tokens with stopReason max_tokens, even though maxTokens is set to 8192.
Reproduction Steps
- Use the Claude 3.5 v1 model with a prompt whose response exceeds 4096 tokens
- Set maxTokens (max_tokens) to 8192
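The steps above could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual code: the helper names (`build_converse_kwargs`, `hit_lower_cap`, `run_repro`) are hypothetical, the model ID and region are taken from the invocation log above, and the prompt is left to the caller. It uses the boto3 `bedrock-runtime` client's `converse` call and checks whether the response stopped on `max_tokens` below the requested limit.

```python
def build_converse_kwargs(model_id, prompt, max_tokens=8192):
    """Build keyword arguments for bedrock-runtime converse()."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": 0.7},
    }


def hit_lower_cap(response, requested_max_tokens):
    """True if generation stopped on max_tokens before reaching the requested limit."""
    output_tokens = response["usage"]["outputTokens"]
    return (
        response["stopReason"] == "max_tokens"
        and output_tokens < requested_max_tokens
    )


def run_repro(prompt, region="eu-central-1"):
    """Send one request; needs AWS credentials and access to the model."""
    import boto3  # imported here so the helpers above work without boto3 installed

    client = boto3.client("bedrock-runtime", region_name=region)
    # Inference profile ID as it appears in the log above (assumption: usable in your account).
    kwargs = build_converse_kwargs(
        "eu.anthropic.claude-3-5-sonnet-20240620-v1:0", prompt
    )
    response = client.converse(**kwargs)
    capped = hit_lower_cap(response, kwargs["inferenceConfig"]["maxTokens"])
    return response["usage"]["outputTokens"], response["stopReason"], capped
```

Calling `run_repro("<a prompt that asks for a very long answer>")` should, if the behavior described in this issue is present, return 4096 output tokens, a stop reason of `max_tokens`, and `True`.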
Possible Solution
No response
Additional Information/Context
No response
SDK version used
1.38.3
Environment details (OS name and version, etc.)
Ubuntu
Hello @ory-name-tbd, thanks for reaching out. I have replicated the issue and see the same behavior: outputTokens does not follow maxTokens. Since this is a Bedrock issue, I have reached out to the Bedrock service team about it and will post here as soon as there is an update. Please let me know if you have further questions. Thank you.
For Internal Tracking: P247132763
Thank you @adev-code for raising this with the Bedrock team. Is there any update from them as of today? Is the issue planned to be addressed, or is there a better place to follow this problem?
It would be nice to have some info, as the problem has existed for quite some time now. See https://github.com/boto/boto3/issues/4279#issuecomment-2361634590
Hello, thanks for your patience. The Bedrock team has provided an update: Sonnet 3.5 v1 has a max output token limit of 4096.
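Given that limit, one client-side option is to clamp the requested maxTokens before calling the API, so the configured value matches what the model can actually return. The sketch below is only an illustration: `KNOWN_OUTPUT_LIMITS` and `clamp_max_tokens` are hypothetical names, and the table contains only the single limit stated in this thread.

```python
# Output-token limits known from this thread only; add entries you have verified yourself.
KNOWN_OUTPUT_LIMITS = {
    "anthropic.claude-3-5-sonnet-20240620-v1:0": 4096,
}


def clamp_max_tokens(model_id, requested):
    """Return the requested maxTokens, lowered to the model's known output limit.

    Substring matching lets inference-profile IDs and ARNs that embed the
    base model ID resolve to the same limit. Unknown models pass through unchanged.
    """
    for known_id, limit in KNOWN_OUTPUT_LIMITS.items():
        if known_id in model_id:
            return min(requested, limit)
    return requested
```

For example, `clamp_max_tokens("eu.anthropic.claude-3-5-sonnet-20240620-v1:0", 8192)` gives 4096, which can then be passed as `inferenceConfig["maxTokens"]`. Responses that still end with stopReason `max_tokens` need to be handled by the caller, for instance by asking for shorter output.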
This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.