[Feature]: support service_tier for o3 and o4-mini
The Feature
Hi,
Supporting the service_tier parameter ("flex") can cut costs roughly in half when using o3 or o4-mini.
See: https://platform.openai.com/docs/guides/flex-processing
Motivation, pitch
It cuts costs by half, which is always nice :)
Are you a ML Ops Team?
No
Twitter / LinkedIn details
No response
According to the documentation, any parameters not on this list are considered provider-specific and are passed directly to the LLM API.
So you can use this parameter directly. I have tried it, and it works fine.
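For reference, a minimal sketch of what worked for me. The pass-through behavior is what the docs describe for non-standard parameters; the long timeout value is my own assumption based on OpenAI's flex-processing guide (flex requests can queue for a while), not a LiteLLM requirement:

```python
# Minimal sketch: pass service_tier straight through litellm.completion.
# Assumes OPENAI_API_KEY is set in the environment.
import litellm

response = litellm.completion(
    model="openai/o3",
    messages=[{"role": "user", "content": "Hello"}],
    service_tier="flex",  # not a LiteLLM-standard param, forwarded as-is to OpenAI
    timeout=900,          # assumption: flex requests may queue, so allow a longer timeout
)
print(response.choices[0].message.content)
```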
Is there a way to add this parameter to the model endpoint stored in the litellm gateway through the UI, so that all calls to that endpoint are sent with the service_tier parameter set?
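Not sure about the UI, but a config-file sketch like the following should achieve the same effect, assuming extra keys under litellm_params are forwarded with every request to that deployment (the o3-flex alias here is hypothetical):

```yaml
model_list:
  - model_name: o3-flex            # hypothetical alias clients would call
    litellm_params:
      model: openai/o3
      api_key: os.environ/OPENAI_API_KEY
      service_tier: "flex"         # assumption: forwarded on every call to this deployment
```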