ai
feat: ability to define fallback model / provider
Feature Description
It would be helpful to be able to define "fallback" models / providers when building AI applications.
i.e. if the primary provider (x) errors out, fall back to provider (y).
Use Case
No response
Additional context
No response
+1, need this as well.
I coded it myself in the meantime, though.
+1
+1
+1
+1
@lgrammel Any plan to implement this? I think it would be great to integrate it into the AI SDK.
@namanyayg keen to know what you did
@wong2 unsure when/how we are going to add this. the ideal solution imo requires a proxy layer bc you want to monitor ai provider uptime across many requests in a potentially distributed system and switch more intelligently. the solution I would suggest in the interim is to use a custom middleware.
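Until something official lands, the interim idea could be sketched as a plain wrapper rather than real AI SDK middleware (everything below, including the name `tryProviders`, is hypothetical and not an SDK API): each entry is a thunk that calls one provider, and the first success wins.

```ts
// Hypothetical sketch: sequentially try a list of provider calls until one succeeds.
type Attempt<T> = () => Promise<T>;

export async function tryProviders<T>(attempts: Attempt<T>[]): Promise<T> {
  let lastError: unknown;
  for (const attempt of attempts) {
    try {
      return await attempt();
    } catch (error) {
      lastError = error; // remember the failure and move on to the next provider
    }
  }
  // All providers failed (or the list was empty); surface the last error.
  throw lastError;
}
```

Usage would look like `tryProviders([() => generateText({ model: primary, ... }), () => generateText({ model: backup, ... })])`, with each thunk closing over its own model.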
@lgrammel In my case, I don't need to monitor uptimes; we just need to fall back to another provider when the first request fails for all kinds of reasons (e.g. moderation failures, credits used up).
+1 to what @wong2 said
Until this is implemented officially, you can do this with https://github.com/remorses/ai-fallback
+1, need it as well
Ya this is a 100000% required as an official library pattern. In my opinion any custom solution (thanks for sharing @e-simpson) seems like it would be error-prone to version-change drift.
An officially supported optional middleware would be amazing.
I think a great DX here would be being able to stipulate which error codes signify a retry, and how many attempts to allow for each model.
Perhaps for now there could just be a retry callback that lets the user handle the distributed failure flagging. I know personally, I monitor api status/uptime anyway within the app/db. My main issue here is when APIs return 429's or an object fails to generate and I want to switch to another model and retry.
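A minimal, self-contained sketch of that idea (not part of the AI SDK; `generateWithFallback`, `FallbackOptions`, and the assumption that errors carry a numeric `status` field are all mine): each model call is retried up to a per-model limit, and only errors whose status is in the retryable list (e.g. 429) trigger a retry or fallback.

```ts
// Hypothetical sketch, assuming each model call rejects with an error carrying
// a numeric `status` field (as HTTP client errors typically do).
interface FallbackOptions {
  retryableStatuses: number[]; // which HTTP statuses should trigger a retry/fallback
  attemptsPerModel: number;    // how many times to try each model before moving on
}

export async function generateWithFallback<T>(
  modelCalls: Array<() => Promise<T>>,
  { retryableStatuses, attemptsPerModel }: FallbackOptions,
): Promise<T> {
  let lastError: unknown;
  for (const call of modelCalls) {
    for (let attempt = 0; attempt < attemptsPerModel; attempt++) {
      try {
        return await call();
      } catch (error) {
        lastError = error;
        const status = (error as { status?: number }).status;
        // Non-retryable errors (e.g. a schema bug) should fail fast.
        if (status === undefined || !retryableStatuses.includes(status)) {
          throw error;
        }
      }
    }
  }
  // Every model exhausted its attempts; surface the last retryable error.
  throw lastError;
}
```

Each entry in `modelCalls` would be a thunk closing over one model, e.g. `() => generateObject({ model: openai('gpt-4o-mini'), ... })`.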
+1 agree, essential for dealing with quota/availability issues
+1
need
+1
+1
+1
+1 -- claude being so unstable this is a must
+1 please
There's some support for fallback when using custom providers:
https://sdk.vercel.ai/docs/ai-sdk-core/provider-management
Ah this doesn't actually fallback on error: https://ai-sdk.dev/docs/reference/ai-sdk-core/custom-provider#fallback-provider
+1. this would also be really helpful in situations like this https://github.com/vercel/ai/issues/6589
Wrote a function for simple fallback support for the AI SDK 👇
```ts
export const fallback = async <T, H, G>(key: string, item: T[], genFunc: (args?: H) => Promise<G>) => {
  let index = -1;
  do {
    try {
      // First attempt runs with no override (the default model inside genFunc);
      // each subsequent attempt overrides `key` with the next fallback item.
      const resp = await genFunc(index >= 0 ? ({ [key]: item[index] } as H) : undefined);
      return resp;
    } catch (error) {
      index++;
      if (index >= item.length) {
        throw error;
      }
    }
  } while (index < item.length);
};
```
Example usage:
```ts
await fallback<LanguageModelV1, any, GenerateObjectResult<any>>(
  "model",
  [google("gemini-1.5-flash-8b"), openai("gpt-4o-mini")],
  (args) =>
    generateObject({
      model: google("gemini-2.0-flash"),
      prompt: preConfigPrompt,
      schema: preConfigSchema,
      temperature: 0,
      ...args,
    }),
);
```
Reference implementation in this AI SDK project: https://github.com/JigsawStack/omiai
Thanks @yoeven. This worked well for me 👍
I think streamText should add a retry callback to onError so that errors happening mid-stream can be retried, and then pass the error object being retried to prepareStep so that you can implement fallback model logic:
```ts
const models = [openai('gpt-4.1-mini'), anthropic('claude-3-haiku-20240307')]
let currentModelIndex = 0

const res = streamText({
  model: models[currentModelIndex],
  onError: ({ error, retry }) => {
    if (error?.message?.includes('Overloaded')) {
      retry()
    }
  },
  prepareStep({ model, error }) {
    if (error) {
      // cycle through models on error
      currentModelIndex += 1
      currentModelIndex %= models.length
    }
    return {
      model: models[currentModelIndex],
      // only activate tools that work with the current provider via activeTools
    }
  },
  messages: [
    {
      role: 'user',
      content: 'say "hello" 3 times with spaces. exactly that and nothing else',
    },
  ],
})
```
Using prepareStep is essential because you need to be able to pass the activeTools that work for the current model provider: for example, webSearchPreview for OpenAI and webSearch_20250305 for Anthropic.
@lgrammel
+1