
[Feature]: support service_tier for o3 and o4-mini

Open alexzeitgeist opened this issue 8 months ago • 1 comment

The Feature

Hi,

Supporting the service_tier parameter (e.g. "flex") can cut costs roughly in half when using o3 or o4-mini.

See: https://platform.openai.com/docs/guides/flex-processing

Motivation, pitch

It cuts costs roughly in half, which is always nice :)

Are you a ML Ops Team?

No

Twitter / LinkedIn details

No response

alexzeitgeist avatar Apr 25 '25 07:04 alexzeitgeist

According to the documentation, any parameter not on that list is considered provider-specific and is passed directly to the LLM API.

So you can use this parameter directly. I have tried it, and it works fine.

ZeroClover avatar Apr 25 '25 14:04 ZeroClover
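To illustrate the pass-through behavior described above, here is a minimal sketch. It assumes `service_tier` is not a LiteLLM-mapped parameter and is therefore forwarded to OpenAI unchanged; the helper function, the `timeout` value, and the exact model string are illustrative assumptions, not confirmed in this thread.

```python
# Sketch: service_tier is not in LiteLLM's mapped-parameter list, so it is
# forwarded to the provider unchanged as a provider-specific kwarg.
# The "flex" value follows OpenAI's flex-processing docs.

def build_flex_request(model: str, messages: list) -> dict:
    """Assemble kwargs for a flex-tier completion call (hypothetical helper)."""
    return {
        "model": model,
        "messages": messages,
        "service_tier": "flex",  # passed through to the OpenAI API
        "timeout": 900.0,        # flex requests can queue longer (assumed value)
    }

kwargs = build_flex_request("o3", [{"role": "user", "content": "Hello"}])

# In a real environment (requires OPENAI_API_KEY and litellm installed):
# import litellm
# response = litellm.completion(**kwargs)

print(kwargs["service_tier"])
```

The kwargs-builder is only there to keep the sketch runnable offline; in practice you would pass `service_tier="flex"` directly to `litellm.completion(...)`.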

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] avatar Jul 25 '25 00:07 github-actions[bot]

Is there a way to add this parameter to the model endpoint stored in the litellm gateway through the UI? Meaning all calls would be sent with the service_tier parameter set?

zmweske avatar Oct 21 '25 04:10 zmweske
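One possible approach, not confirmed in this thread: rather than the UI, pin the parameter per model in the proxy's `config.yaml`, relying on the same pass-through behavior (extra keys under `litellm_params` are forwarded on every request routed to that model). The `model_name` alias below is a made-up example.

```yaml
# Hedged sketch (assumption, not verified in this thread):
model_list:
  - model_name: o3-flex        # alias exposed by the gateway (hypothetical)
    litellm_params:
      model: openai/o3
      service_tier: "flex"     # sent with every call routed to this model
```

If this works as sketched, all clients calling `o3-flex` through the gateway would get flex pricing without setting the parameter themselves.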