gateway
A Blazing Fast AI Gateway with integrated Guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API.
Portkey's AI Gateway is the interface between your app and hosted LLMs. It streamlines API requests to OpenAI, Anthropic, Mistral, Llama2, Anyscale, Google Gemini, and more with a unified API.
✅ Blazing fast (9.9x faster) with a tiny footprint (~45kb installed)
✅ Load balance across multiple models, providers, and keys
✅ Fallbacks make sure your app stays resilient
✅ Automatic retries with exponential backoff by default
✅ Plug-in middleware as needed
✅ Battle-tested over 100B tokens
Getting Started
Installation
If you're familiar with Node.js and npx, you can run your private AI gateway locally. (Other deployment options)
npx @portkey-ai/gateway
Your AI Gateway is now running on http://localhost:8787 🚀
Usage
Let's try making a chat completions call to OpenAI through the AI gateway:
curl '127.0.0.1:8787/v1/chat/completions' \
-H 'x-portkey-provider: openai' \
-H "Authorization: Bearer $OPENAI_KEY" \
-H 'Content-Type: application/json' \
-d '{"messages": [{"role": "user","content": "Say this is test."}], "max_tokens": 20, "model": "gpt-4"}'
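The same request can be made from Node.js. The sketch below mirrors the curl example above, using the fetch API built into Node 18+; the actual network call is commented out so it only runs against a live gateway, and it assumes your OpenAI key is in the OPENAI_KEY environment variable as in the curl example:

```javascript
// Sketch: the curl chat completions call above, translated to Node.js (18+).
const gatewayUrl = "http://127.0.0.1:8787/v1/chat/completions";

// Request body, identical to the curl example.
const body = {
  model: "gpt-4",
  max_tokens: 20,
  messages: [{ role: "user", content: "Say this is test." }],
};

// Uncomment to send the request against a running gateway:
// const res = await fetch(gatewayUrl, {
//   method: "POST",
//   headers: {
//     "x-portkey-provider": "openai",
//     Authorization: `Bearer ${process.env.OPENAI_KEY}`,
//     "Content-Type": "application/json",
//   },
//   body: JSON.stringify(body),
// });
// console.log((await res.json()).choices[0].message.content);
```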
Full list of supported SDKs
Supported Providers
Provider | Support | Stream | Supported Endpoints
---|---|---|---
OpenAI | ✅ | ✅ | /completions, /chat/completions, /embeddings, /assistants, /threads, /runs
Azure OpenAI | ✅ | ✅ | /completions, /chat/completions, /embeddings
Anyscale | ✅ | ✅ | /chat/completions
Google Gemini & Palm | ✅ | ✅ | /generateMessage, /generateText, /embedText
Anthropic | ✅ | ✅ | /messages, /complete
Cohere | ✅ | ✅ | /generate, /embed, /rerank
Together AI | ✅ | ✅ | /chat/completions, /completions, /inference
Perplexity | ✅ | ✅ | /chat/completions
Mistral | ✅ | ✅ | /chat/completions, /embeddings
Features
Unified API Signature: Connect with 100+ LLMs using OpenAI's API signature. The AI gateway handles the request, response, and error transformations so you don't have to make any changes to your code. You can use the OpenAI SDK itself to connect to any of the supported LLMs.

Fallback: Don't let failures stop you. The fallback feature allows you to specify a list of LLMs in a prioritized order. If the primary LLM fails to respond or encounters an error, Portkey will automatically fall back to the next LLM in the list, keeping your application robust and reliable.

Automatic Retries: Temporary issues shouldn't mean manual re-runs. The AI Gateway can automatically retry failed requests up to 5 times, applying an exponential backoff strategy that spaces out retry attempts to prevent network overload.

Load Balancing: Distribute load effectively across multiple API keys or providers based on custom weights. This ensures high availability and optimal performance of your generative AI apps, preventing any single LLM from becoming a performance bottleneck.
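The exponential backoff mentioned above can be illustrated with a short sketch. The base delay and doubling factor here are illustrative assumptions, not the gateway's actual internals:

```javascript
// Illustrative only: delays for up to 5 retries under exponential backoff.
// Each retry attempt waits twice as long as the previous one.
function backoffDelays(retries, baseMs = 100) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

console.log(backoffDelays(5)); // [ 100, 200, 400, 800, 1600 ]
```

Spacing retries out this way gives a struggling upstream provider time to recover instead of hammering it with immediate re-requests.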
Configuring the AI Gateway
The AI gateway supports configs to enable versatile routing strategies like fallbacks, load balancing, retries and more.
You can use these configs while making the OpenAI call through the x-portkey-config header.
// Using the OpenAI JS SDK
const client = new OpenAI({
baseURL: "http://127.0.0.1:8787", // The gateway URL
defaultHeaders: {
'x-portkey-config': {.. your config here ..},
}
});
Here's an example config that retries an OpenAI request 5 times before falling back to Gemini Pro:
{
"retry": { "count": 5 },
"strategy": { "mode": "fallback" },
"targets": [{
"provider": "openai",
"api_key": "sk-***"
},{
"provider": "google",
"api_key": "gt5***",
"override_params": {"model": "gemini-pro"}
}]
}
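Since HTTP header values are strings, a config object like the one above is typically serialized before being attached to the request. This is a minimal sketch; the x-portkey-config header name comes from the SDK snippet above, while serializing with JSON.stringify is an assumption based on standard HTTP semantics:

```javascript
// Sketch: serializing the fallback config for the x-portkey-config header.
const config = {
  retry: { count: 5 },
  strategy: { mode: "fallback" },
  targets: [
    { provider: "openai", api_key: "sk-***" },
    { provider: "google", api_key: "gt5***", override_params: { model: "gemini-pro" } },
  ],
};

// Header values must be strings, so serialize the config object:
const headers = { "x-portkey-config": JSON.stringify(config) };

// Round-trip check: the gateway receives the same structure we defined.
const parsed = JSON.parse(headers["x-portkey-config"]);
console.log(parsed.strategy.mode); // "fallback"
```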
This config would enable equal load balancing between two OpenAI keys:
{
  "strategy": { "mode": "loadbalance" },
  "targets": [{
    "provider": "openai",
    "api_key": "sk-***",
    "weight": 0.5
  },{
    "provider": "openai",
    "api_key": "sk-***",
    "weight": 0.5
  }]
}
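To make the weight semantics concrete, here is a small illustrative helper, not part of Portkey's API, that picks a target at random in proportion to its weight, which is how a weighted load balancer typically splits traffic. The target list and key names are hypothetical placeholders:

```javascript
// Illustrative only: weighted random selection over config targets,
// mimicking how a load balancer might split traffic by weight.
function pickTarget(targets) {
  const total = targets.reduce((sum, t) => sum + Number(t.weight), 0);
  let r = Math.random() * total;
  for (const t of targets) {
    r -= Number(t.weight);
    if (r <= 0) return t;
  }
  return targets[targets.length - 1]; // guard against float rounding
}

const targets = [
  { provider: "openai", api_key: "sk-aaa", weight: 0.5 },
  { provider: "openai", api_key: "sk-bbb", weight: 0.5 },
];

// With equal weights, each key receives ~50% of requests over time.
console.log(pickTarget(targets).provider); // "openai"
```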
Read more about the config object.
Supported SDKs
Language | Supported SDKs
---|---
Node.js / JS / TS | Portkey SDK, OpenAI SDK, LangchainJS, LlamaIndex.TS
Python | Portkey SDK, OpenAI SDK, Langchain, LlamaIndex
Go | go-openai
Java | openai-java
Rust | async-openai
Ruby | ruby-openai
Deploying AI Gateway
See docs on installing the AI Gateway locally or deploying it on popular locations.
Roadmap
- Support for more providers. Missing a provider or LLM platform? Raise a feature request.
- Enhanced load balancing features to optimize resource use across different models and providers.
- More robust fallback and retry strategies to further improve the reliability of requests.
- Increased customizability of the unified API signature to cater to more diverse use cases.
💬 Participate in Roadmap discussions here.
Contributing
The easiest way to contribute is to pick any issue with the good first issue tag 💪. Read the Contributing guidelines here.
Bug Report? File here | Feature Request? File here
Community
Join our growing community around the world, for help, ideas, and discussions on AI.
- View our official Blog
- Chat live with us on Discord
- Follow us on Twitter
- Connect with us on LinkedIn