ai
feat: ability to define fallback model / provider
Feature Description
It would be helpful to be able to define "fallback" models / providers when building AI applications.
i.e. if the primary provider (x) errors out, fall back to provider (y).
Use Case
No response
Additional context
No response
+1, need this as well.
I coded it myself in the meantime, though.
+1
+1
+1
+1
@lgrammel Any plan to implement this? I think it would be great to integrate it into the AI SDK.
@namanyayg keen to know what you did
@wong2 unsure when/how we are going to add this. the ideal solution imo requires a proxy layer bc you want to monitor ai provider uptime across many requests in a potentially distributed system and switch more intelligently. the solution I would suggest in the interim is to use a custom middleware.
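Until something official lands, the interim idea could be sketched as a plain wrapper rather than real AI SDK middleware (everything below, including the name `tryProviders`, is hypothetical and not an SDK API): each entry is a thunk that calls one provider, and the first success wins.

```ts
// Hypothetical sketch: sequentially try a list of provider calls until one succeeds.
type Attempt<T> = () => Promise<T>;

export async function tryProviders<T>(attempts: Attempt<T>[]): Promise<T> {
  let lastError: unknown;
  for (const attempt of attempts) {
    try {
      return await attempt();
    } catch (error) {
      lastError = error; // remember the failure and move on to the next provider
    }
  }
  // All providers failed (or the list was empty); surface the last error.
  throw lastError;
}
```

Usage would look like `tryProviders([() => generateText({ model: primary, ... }), () => generateText({ model: backup, ... })])`, with each thunk closing over its own model.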
@lgrammel In my case, I don't need to monitor uptimes; we just need to fall back to another provider when the first request fails for all kinds of reasons (e.g. moderation failures, credits used up).
+1 to what @wong2 said
Until this is implemented officially, you can do this with https://github.com/remorses/ai-fallback
+1, need it as well
Ya this is a 100000% required as an official library pattern. In my opinion any custom solution (thanks for sharing @e-simpson) seems like it would be error-prone to version-change drift.
An officially supported optional middleware would be amazing.
I think a great DX here would be being able to stipulate which error codes signify a retry, and how many attempts to allow for each model.
Perhaps for now there could just be a retry callback that lets the user handle the distributed failure flagging. I know personally, I monitor api status/uptime anyway within the app/db. My main issue here is when APIs return 429's or an object fails to generate and I want to switch to another model and retry.
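A minimal, self-contained sketch of that idea (not part of the AI SDK; `generateWithFallback`, `FallbackOptions`, and the assumption that errors carry a numeric `status` field are all mine): each model call is retried up to a per-model limit, and only errors whose status is in the retryable list (e.g. 429) trigger a retry or fallback.

```ts
// Hypothetical sketch, assuming each model call rejects with an error carrying
// a numeric `status` field (as HTTP client errors typically do).
interface FallbackOptions {
  retryableStatuses: number[]; // which HTTP statuses should trigger a retry/fallback
  attemptsPerModel: number;    // how many times to try each model before moving on
}

export async function generateWithFallback<T>(
  modelCalls: Array<() => Promise<T>>,
  { retryableStatuses, attemptsPerModel }: FallbackOptions,
): Promise<T> {
  let lastError: unknown;
  for (const call of modelCalls) {
    for (let attempt = 0; attempt < attemptsPerModel; attempt++) {
      try {
        return await call();
      } catch (error) {
        lastError = error;
        const status = (error as { status?: number }).status;
        // Non-retryable errors (e.g. a schema bug) should fail fast.
        if (status === undefined || !retryableStatuses.includes(status)) {
          throw error;
        }
      }
    }
  }
  // Every model exhausted its attempts; surface the last retryable error.
  throw lastError;
}
```

Each entry in `modelCalls` would be a thunk closing over one model, e.g. `() => generateObject({ model: openai('gpt-4o-mini'), ... })`.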
+1 agree, essential for dealing with quota/availability issues
+1
need
+1
+1
+1
+1 -- claude being so unstable this is a must
+1 please
There's some support for fallback when using custom providers:
https://sdk.vercel.ai/docs/ai-sdk-core/provider-management
Ah this doesn't actually fallback on error: https://ai-sdk.dev/docs/reference/ai-sdk-core/custom-provider#fallback-provider
+1. this would also be really helpful in situations like this https://github.com/vercel/ai/issues/6589
Wrote a function for simple fallback support for the AI SDK 👇
```ts
export const fallback = async <T, H, G>(key: string, item: T[], genFunc: (args?: H) => Promise<G>) => {
  let index = -1;
  do {
    try {
      // First attempt runs with no override (the default model inside genFunc);
      // each subsequent attempt overrides `key` with the next fallback item.
      const resp = await genFunc(index >= 0 ? ({ [key]: item[index] } as H) : undefined);
      return resp;
    } catch (error) {
      index++;
      if (index >= item.length) {
        throw error;
      }
    }
  } while (index < item.length);
};
```
Example usage:
```ts
await fallback<LanguageModelV1, any, GenerateObjectResult<any>>(
  "model",
  [google("gemini-1.5-flash-8b"), openai("gpt-4o-mini")],
  (args) =>
    generateObject({
      model: google("gemini-2.0-flash"),
      prompt: preConfigPrompt,
      schema: preConfigSchema,
      temperature: 0,
      ...args,
    }),
);
```
Reference implementation in this AI SDK project: https://github.com/JigsawStack/omiai
Thanks @yoeven. This worked well for me 👍
I think streamText should add a retry callback to onError so that errors happening mid-stream can be retried, and then pass the error object being retried to prepareStep so that you can implement fallback model logic:
```ts
const models = [openai('gpt-4.1-mini'), anthropic('claude-3-haiku-20240307')]
let currentModelIndex = 0

const res = streamText({
  model: models[currentModelIndex],
  onError: ({ error, retry }) => {
    if (error?.message?.includes('Overloaded')) {
      retry()
    }
  },
  prepareStep({ model, error }) {
    if (error) {
      // cycle through models on error
      currentModelIndex += 1
      currentModelIndex %= models.length
    }
    return {
      model: models[currentModelIndex],
      // only activate tools that work with the current provider via activeTools
    }
  },
  messages: [
    {
      role: 'user',
      content: 'say "hello" 3 times with spaces. exactly that and nothing else',
    },
  ],
})
```
Using prepareStep is essential because you need to be able to pass the activeTools that work for the current model provider: for example, webSearchPreview for OpenAI and webSearch_20250305 for Anthropic.
@lgrammel
+1