AI Proxy plugin strips anthropic_version from the request body
Is there an existing issue for this?
- [x] I have searched the existing issues
Kong version ($ kong version)
3.9.0
Current Behavior
When using the Kong AI Proxy plugin to connect to Anthropic Claude 3 models through Vertex AI on the llm/v1/chat route type, Vertex AI returns a "missing anthropic_version" error, even though I included anthropic_version in the request body when calling the Kong AI Proxy route.
In Vertex AI, anthropic_version is a mandatory field in the request body (rather than a header) and must be set to the value vertex-2023-10-16. See these two links for details:
Anthropic reference: https://docs.anthropic.com/en/api/claude-on-vertex-ai
Vertex reference: https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude
Vertex AI request body:
{
"anthropic_version": "vertex-2023-10-16",
"messages": [
{
"role": "ROLE",
"content": "CONTENT"
}],
"max_tokens": MAX_TOKENS,
"stream": STREAM,
"thinking": {
"type": "TYPE",
"budget_tokens": BUDGET_TOKENS
}
}
Curl request:
MODEL_ID="MODEL"
LOCATION="us-central1"
PROJECT_ID="PROJECT_ID"
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/anthropic/models/${MODEL_ID}:streamRawPredict -d \
'{
"anthropic_version": "vertex-2023-10-16",
"messages": [{
"role": "user",
"content": "Hello!"
}],
"max_tokens": 50,
"stream": true}'
Problem: kong/llm/drivers/anthropic.lua transforms the request body and deletes anthropic_version in the process. See https://github.com/Kong/kong/blob/master/kong/llm/drivers/anthropic.lua.
I attempted to re-add that field to the request body using the Request Transformer and AI Request Transformer plugins, but it is still stripped.
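For example, one of the attempted Request Transformer configurations, as a sketch (the route name claude-chat and the local Admin API address are assumptions):

curl -X POST http://localhost:8001/routes/claude-chat/plugins \
  --data "name=request-transformer" \
  --data "config.add.body=anthropic_version:vertex-2023-10-16"

The plugin adds the field, but the AI Proxy driver's transformation still removes it before the request reaches Vertex AI.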
Expected Behavior
The Kong AI Proxy should preserve anthropic_version in the request body so that requests to Claude models on Vertex AI succeed.
Steps To Reproduce
- Define an AI Proxy plugin for the route type llm/v1/chat
- Define an HTTP route with the path prefix /claude/chat
- Call /claude/chat with the request body below (a setup sketch follows the example)
{
"anthropic_version": "vertex-2023-10-16",
"messages": [{
"role": "user",
"content": "Hey Claude!"
}],
"max_tokens": 100
}
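For reference, a minimal sketch of steps 1 and 2 via the Admin API. The service/route names, model name, and the local Admin API address are assumptions, and the upstream_url mirrors the Vertex AI endpoint from the curl example above; adjust all of these for your environment:

# Create a service and a route with the /claude/chat path prefix
curl -X POST http://localhost:8001/services \
  --data "name=claude-vertex" \
  --data "url=https://us-central1-aiplatform.googleapis.com"

curl -X POST http://localhost:8001/services/claude-vertex/routes \
  --data "name=claude-chat" \
  --data "paths[]=/claude/chat"

# Attach AI Proxy with the Anthropic provider, pointed at the Vertex AI endpoint
curl -X POST http://localhost:8001/routes/claude-chat/plugins \
  --data "name=ai-proxy" \
  --data "config.route_type=llm/v1/chat" \
  --data "config.model.provider=anthropic" \
  --data "config.model.name=<MODEL_ID>" \
  --data "config.model.options.upstream_url=https://us-central1-aiplatform.googleapis.com/v1/projects/<PROJECT_ID>/locations/us-central1/publishers/anthropic/models/<MODEL_ID>:rawPredict" \
  --data "config.auth.header_name=Authorization" \
  --data "config.auth.header_value=Bearer <gcloud-access-token>"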
Anything else?
No response
@fffonion Could you help check whether this is by design or an issue?
@fffonion I'd really appreciate it if you could take a look and advise. Thanks in advance.
I'd really appreciate it if someone could take a look and guide us on resolving this. Thanks in advance.
We will look into this, thanks for your patience.
Since anthropic_version is an extra field in the OpenAI format, it gets stripped, which is correct behavior. Normally you would set anthropic_version in model.options; however, using the normalized OpenAI request format to talk to Anthropic models on Vertex is currently not supported, and we are looking at a proper fix.
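In versions where that option applies, it would look something like this (a sketch; the plugin ID placeholder is an assumption):

curl -X PATCH http://localhost:8001/plugins/<ai-proxy-plugin-id> \
  --data "config.model.options.anthropic_version=vertex-2023-10-16"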
You can use the native incoming format (set config.llm_format to gemini) once this PR is merged: https://github.com/Kong/kong/pull/14416.
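Once that lands, switching to the native format would look something like this (a sketch; the plugin ID placeholder is an assumption, and config.llm_format is only available in versions that include the PR above):

curl -X PATCH http://localhost:8001/plugins/<ai-proxy-plugin-id> \
  --data "config.llm_format=gemini"

With the native format, the incoming body is forwarded without being normalized to the OpenAI schema, so fields such as anthropic_version are preserved.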
Closing this as the mentioned PR has been merged. If there's still a problem, please feel free to re-open.