litellm icon indicating copy to clipboard operation
litellm copied to clipboard

add max tokens to health check

Open wallies opened this issue 1 year ago • 1 comments

Title

health check should only need a minimal token completion call

Relevant issues

Type

🐛 Bug Fix

Changes

With the health check endpoint we really only care if we get a 200 status back not the output, so its best to send max_tokens, otherwise as we have seen, sending a health check every 300 seconds can get very expensive for output tokens.

wallies avatar Dec 03 '24 12:12 wallies

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
litellm ✅ Ready (Inspect) Visit Preview 💬 Add feedback Dec 3, 2024 0:32am

vercel[bot] avatar Dec 03 '24 12:12 vercel[bot]

@krrishdholakia can you take a look at this PR please? The health check ended up costing us a lot of money when its just looking at the return not content.

wallies avatar Dec 10 '24 12:12 wallies

Makes sense. And sorry for that. Yes I'll merge this in today

krrishdholakia avatar Dec 10 '24 15:12 krrishdholakia