
[Feature]: Add cost based routing

Open rlippmann opened this issue 1 year ago • 3 comments

The Feature

It would be nice for the router class to have an additional routing type of lowest cost.

Maybe also have a tie-breaker using the other scheduling mechanisms. This could be used for a routing strategy for locally hosted models (which would essentially have 0 cost).
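A minimal sketch of what lowest-cost selection with a tie-breaker could look like. This is standalone Python, not litellm's Router API: the `Deployment` class, its `cost_per_1k_tokens` and `in_flight` fields, and `pick_lowest_cost` are all hypothetical names used only for illustration. Cost is compared first, and the number of in-flight requests (a stand-in for a "least busy" strategy) only matters when costs are equal, which is the case for several locally hosted models at zero cost.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    # All fields are hypothetical; they do not mirror litellm's model_list schema.
    name: str
    cost_per_1k_tokens: float  # 0.0 for a locally hosted model
    in_flight: int = 0         # current concurrent requests, used only as a tie-breaker


def pick_lowest_cost(deployments: list[Deployment]) -> Deployment:
    """Return the cheapest deployment; among equally cheap ones, the least busy."""
    if not deployments:
        raise ValueError("no deployments to route to")
    # Tuples compare element by element, so in_flight is only consulted on a cost tie.
    return min(deployments, key=lambda d: (d.cost_per_1k_tokens, d.in_flight))


deployments = [
    Deployment("hosted-large", cost_per_1k_tokens=0.03, in_flight=1),
    Deployment("local-a", cost_per_1k_tokens=0.0, in_flight=4),
    Deployment("local-b", cost_per_1k_tokens=0.0, in_flight=2),
]
chosen = pick_lowest_cost(deployments)
print(chosen.name)  # local-b: both local models cost 0.0, and local-b is less busy
```

Any other existing strategy could take the place of `in_flight` as the second element of the sort key.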

Motivation, pitch

In order to reduce costs for inference, it would be nice to have this routing strategy.

Twitter / LinkedIn details

No response

rlippmann avatar Mar 20 '24 04:03 rlippmann

Can you share an example router config for this? @rlippmann

krrishdholakia avatar Mar 20 '24 14:03 krrishdholakia

I was just thinking that, in addition to latency-based, least-busy, etc., you could add a "lowest cost" routing strategy.

Something like:

Router(model_list=..., routing_strategy="least-cost")

I guess you could also put your "free" models in a group, and use fallbacks, but adding a separate routing strategy for those might be good too, i.e.

Router(
    model_list=...,
    routing_strategy="least-busy",
    fallback_models=...,
    fallback_routing_strategy="usage-based-routing",
    context_window_fallbacks=...,
    context_window_routing_strategy="least-cost",
)

rlippmann avatar Mar 21 '24 11:03 rlippmann

Allowing an order even within a model group makes a lot of sense now: Bedrock pricing for Mistral in Paris is higher than Mistral pricing in us-east / us-west.
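As a rough illustration of ordering within a model group, the sketch below turns per-region prices into an order rank, where 1 is tried first and regions with equal prices share a rank. The function `rank_by_price` is hypothetical and the prices are placeholders, not real Bedrock rates; nothing here is litellm's API.

```python
def rank_by_price(prices: dict[str, float]) -> dict[str, int]:
    """Map each region to an order rank (1 = cheapest); equal prices share a rank."""
    distinct = sorted(set(prices.values()))
    rank_of = {price: i + 1 for i, price in enumerate(distinct)}
    return {region: rank_of[price] for region, price in prices.items()}


# Placeholder prices per 1K tokens for the same model in three regions (assumed values).
example_prices = {
    "eu-west-3": 0.00060,  # Paris
    "us-east-1": 0.00045,
    "us-west-2": 0.00045,
}
order = rank_by_price(example_prices)
print(order)
```

With these placeholder numbers the two US regions share rank 1 and Paris gets rank 2, so a router honoring the order would only send traffic to Paris when the cheaper regions are unavailable.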

krrishdholakia avatar Apr 06 '24 16:04 krrishdholakia

PR here: https://github.com/BerriAI/litellm/pull/3504 @rlippmann - I'd love to get on a call and learn how we can improve litellm for you. If you can share your email, I'll send an invite.

If it's easier, here's our Calendly: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat LinkedIn for DMs: https://www.linkedin.com/in/reffajnaahsi/

ishaan-jaff avatar May 07 '24 20:05 ishaan-jaff