litellm
feat(bedrock.py): Add Cloudflare AI Gateway support
Title
This adds Cloudflare AI Gateway support for Bedrock.
Relevant issues
Resolves #1040.
Type
🆕 New Feature 🚄 Infrastructure
Changes
We register a boto3 event hook that rewrites the request URL after the request has been signed, but before it is sent.
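As a rough illustration of the idea (not the code in this PR), the sketch below rewrites an already-signed request URL so that it points at the gateway while keeping the original path and query. The helper names `rewrite_to_gateway`, `make_before_send_hook`, and `register_gateway_hook` are hypothetical, and using botocore's `before-send` event is an assumption about where such a hook could live; the actual change may hook a different event.

```python
from urllib.parse import urlsplit


def rewrite_to_gateway(signed_url: str, gateway_base: str) -> str:
    """Keep the path and query of the signed URL, but prefix them with the gateway base."""
    parts = urlsplit(signed_url)
    rewritten = gateway_base.rstrip("/") + parts.path
    if parts.query:
        rewritten += "?" + parts.query
    return rewritten


def make_before_send_hook(gateway_base: str):
    """Build an event handler (hypothetical helper) that redirects a signed request."""

    def hook(request, **kwargs):
        # By the time before-send fires, the request has already been signed,
        # so the signature still covers the original AWS host.
        request.url = rewrite_to_gateway(request.url, gateway_base)
        # Returning None lets botocore send the modified request as usual.
        return None

    return hook


def register_gateway_hook(client, gateway_base: str) -> None:
    """Attach the hook to a boto3 bedrock-runtime client (event name is an assumption)."""
    client.meta.events.register(
        "before-send.bedrock-runtime.*", make_before_send_hook(gateway_base)
    )
```

With a client created via `boto3.client("bedrock-runtime", region_name="us-east-1")`, calling `register_gateway_hook(client, gateway_url)` would route subsequent calls through the gateway URL while leaving the signing step untouched.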
Testing
model_list:
  - model_name: claude-3-haiku-20240307
    litellm_params:
      model: bedrock/anthropic.claude-3-haiku-20240307-v1:0
      aws_region_name: us-east-1
      max_tokens: 4096
      aws_bedrock_runtime_endpoint: https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID_HERE/GATEWAY_ID_HERE/aws-bedrock/bedrock-runtime/us-east-1
curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "claude-3-haiku-20240307",
    "max_tokens": 100,
    "temperature": 1.0,
    "messages": [
      {
        "role": "user",
        "content": "Tell a joke."
      }
    ]
  }'
Notes
Streaming through Cloudflare AI Gateway currently fails, and I have not yet determined why. The following request reproduces the error:
curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "claude-3-haiku-20240307",
    "max_tokens": 100,
    "temperature": 1.0,
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Tell a joke."
      }
    ]
  }'
data: {"error": {"message": "Header length of 3216834560 exceeded the maximum of 131072\n\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.11/site-packages/litellm/proxy/proxy_server.py\", line 3159, in async_data_generator\n async for chunk in response:\n File \"/usr/local/lib/python3.11/site-packages/litellm/utils.py\", line 10971, in __anext__\n raise e\n File \"/usr/local/lib/python3.11/site-packages/litellm/utils.py\", line 10903, in __anext__\n chunk = next(self.completion_stream)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 602, in __iter__\n for event in self._event_generator:\n File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 611, in _create_raw_event_generator\n yield from event_stream_buffer\n File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 544, in __next__\n return self.next()\n ^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 536, in next\n self._prelude = self._parse_prelude()\n ^^^^^^^^^^^^^^^^^^^^^\n File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 480, in _parse_prelude\n self._validate_prelude(prelude)\n File \"/usr/local/lib/python3.11/site-packages/botocore/eventstream.py\", line 471, in _validate_prelude\n raise InvalidHeadersLength(prelude.headers_length)\nbotocore.eventstream.InvalidHeadersLength: Header length of 3216834560 exceeded the maximum of 131072\n", "type": "None", "param": "None", "code": 500}}
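One possible reading of this error, offered as an unconfirmed hypothesis rather than a diagnosis: the reported header length 3216834560 equals `0xBFBD0000`, and `EF BF BD` is the UTF-8 encoding of the replacement character U+FFFD. That pattern would appear if something between Bedrock and the client decoded the binary event stream as UTF-8 text and re-encoded it. The sketch below reproduces the number using a made-up 12-byte prelude (the sample bytes are assumptions chosen for illustration) and the big-endian three-integer prelude layout that botocore's event stream parser reads.

```python
import struct

MAX_HEADERS_LENGTH = 128 * 1024  # 131072, the limit quoted in the error message


def parse_prelude(data: bytes):
    """Read total length, headers length, and prelude CRC as big-endian uint32 values."""
    total_length, headers_length, prelude_crc = struct.unpack("!III", data[:12])
    return total_length, headers_length, prelude_crc


# Hypothetical prelude: total length 0x1A3, headers length 0x52, placeholder CRC.
original = bytes.fromhex("000001a3" "00000052" "11223344")

# Simulate a lossy text round trip: the invalid byte 0xA3 becomes EF BF BD,
# which shifts every following byte by two positions.
mangled = original.decode("utf-8", errors="replace").encode("utf-8")

_, original_headers_length, _ = parse_prelude(original)
_, mangled_headers_length, _ = parse_prelude(mangled)

print(original_headers_length)  # 82
print(mangled_headers_length)   # 3216834560
```

If this hypothesis holds, comparing the raw response bytes received through the gateway with those received directly from AWS would show `EF BF BD` sequences in place of bytes at or above `0x80`.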
Pre-Submission Checklist (optional but appreciated):
- [ ] I have included relevant documentation updates (stored in /docs/my-website)
OS Tests (optional but appreciated):
- [ ] Tested on Windows
- [ ] Tested on MacOS
- [X] Tested on Linux
Please add a test for this, @Manouchehri. I want to make sure no regressions occur. I believe you can mock the Bedrock calls.
I added a test, let me know if it fails. =)
cd litellm/tests/
poetry run pytest test_bedrock_completion.py::test_completion_bedrock_cloudflare_ai_gateway -s -v
Confirmed, it passes. =)