Gateway error when processing non-200 model response
What Happened?
Issue Description
The gateway fails to properly parse error messages when accessing Claude models on GCP Vertex AI via the streaming endpoint. The issue occurs when the model returns a non-200 response.
Environment
- Gateway Endpoint: /v1/chat/completions
- Model: anthropic.claude-3-5-sonnet@20240620
- Provider: GCP Vertex AI
- Gateway is locally hosted
Configuration
{
  "strategy": {
    "mode": "loadbalance"
  },
  "targets": [
    {
      "provider": "vertex-ai",
      "vertex_project_id": "dev",
      "vertex_service_account_json": "SA",
      "vertex_region": "us-east5",
      "weight": 1
    }
  ]
}
Error Output
Gateway logs:
event: error
2024-11-01 11:43:41 ^
2024-11-01 11:43:41
2024-11-01 11:43:41 SyntaxError: Unexpected token 'e', "event: err"... is not valid JSON
2024-11-01 11:43:41 at JSON.parse (<anonymous>)
2024-11-01 11:43:41 at Xt (file:///app/build/start-server.js:2:71350)
2024-11-01 11:43:41 at file:///app/build/start-server.js:2:146756
2024-11-01 11:43:41 at async file:///app/build/start-server.js:2:146361
Application logs:
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Root Cause Analysis
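- The error occurs when the model returns a non-200 response. The stack trace suggests the gateway passes the response body, which begins with "event: error", to JSON.parse as if it were plain JSON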
- Verified by running parallel requests directly against the GCP API
Additional Notes
- This issue reproduces with other configurations, including fallback mode
- The error is not specific to the provided configuration
What Should Have Happened?
The gateway should return an appropriate error response to the application.
Hitting the GCP API directly returns:
529 {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}
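One possible way to handle this is sketched below. It is not the gateway's actual code: parseUpstreamError and UpstreamError are hypothetical names, and the sketch assumes the non-200 streaming body is either plain JSON (as in the direct response above) or SSE-framed, with an "event: error" line followed by a "data:" line carrying the same JSON. Only the "event: err" prefix is confirmed by the gateway logs; the rest of the framing is an assumption.

```typescript
// Hypothetical helper, not part of the gateway's actual code.
interface UpstreamError {
  type: string;
  message: string;
}

// Extracts an error from an upstream body that is either plain JSON or
// SSE-framed ("event: error\ndata: {...}"). Never throws.
function parseUpstreamError(body: string): UpstreamError {
  // First candidate: the whole body, in case it is plain JSON.
  const candidates: string[] = [body.trim()];

  // Further candidates: the payload of every SSE "data:" line.
  for (const line of body.split(/\r?\n/)) {
    if (line.startsWith("data:")) {
      candidates.push(line.slice("data:".length).trim());
    }
  }

  for (const candidate of candidates) {
    try {
      const parsed = JSON.parse(candidate);
      const err = parsed?.error;
      if (err && typeof err.message === "string") {
        return {
          type: String(err.type ?? "unknown_error"),
          message: err.message,
        };
      }
    } catch {
      // Not valid JSON; try the next candidate.
    }
  }

  // Fall back to the raw text so the caller still sees what upstream sent.
  return {
    type: "unknown_error",
    message: body.trim() || "Empty error response",
  };
}

// Usage: an SSE-framed body (assumed shape) and a non-JSON body.
const sseBody =
  'event: error\n' +
  'data: {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}\n\n';
console.log(parseUpstreamError(sseBody));
console.log(parseUpstreamError("Bad Gateway"));
```

With a helper along these lines, the SSE-framed body yields the overloaded_error type and the "Overloaded" message instead of a SyntaxError, and the gateway could forward both to the application together with the upstream status code (529 here) rather than dropping the connection.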
Relevant Code Snippet
No response
Your Twitter/LinkedIn
https://www.linkedin.com/in/maxkrueger1/