
[Bug]: The meaning of "max_tokens" reported by /model/info is inconsistent

Open jeromeroussin opened this issue 1 year ago • 1 comments

What happened?

We've noticed that "max_tokens" (from https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json or https://github.com/BerriAI/litellm/blob/main/litellm/model_prices_and_context_window_backup.json) sometimes means max_input_tokens and sometimes max_output_tokens. We were internally relying on it to mean max_input_tokens. We'll switch to explicitly using max_input_tokens instead, but it does seem odd that the meaning of max_tokens is inconsistent.
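As a workaround, a consumer of the model-info JSON can prefer the unambiguous max_input_tokens key and only fall back to max_tokens when it is absent. The sketch below assumes hypothetical sample entries shaped like those in model_prices_and_context_window.json; the values are illustrative, not copied from the file.

```python
# Hypothetical sketch: resolve a model's input-token limit from a
# model_prices_and_context_window.json-style entry. Prefer the
# unambiguous "max_input_tokens" key; fall back to the ambiguous
# "max_tokens" only when the explicit key is missing.
# SAMPLE_MODEL_INFO is an illustrative stand-in, not real data.
SAMPLE_MODEL_INFO = {
    "model-a": {
        "max_tokens": 4096,          # here: actually the output limit
        "max_input_tokens": 128000,
        "max_output_tokens": 4096,
    },
    "model-b": {
        "max_tokens": 8192,          # older-style entry: only max_tokens
    },
}

def input_token_limit(model: str) -> int:
    info = SAMPLE_MODEL_INFO[model]
    # Explicit key wins; otherwise we have no choice but to trust max_tokens.
    return info.get("max_input_tokens", info["max_tokens"])
```

With this approach, model-a resolves to 128000 rather than the misleading 4096 its max_tokens field would suggest.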

Relevant log output

No response

Twitter / LinkedIn details

No response

jeromeroussin avatar Apr 04 '24 13:04 jeromeroussin

Continuing discussion from linkedin:

Hey @jeromeroussin would it be simpler if:

  • max_tokens = input + output token combined
  • max_input_tokens = max tokens you can put in
  • max_output_tokens = max tokens you can ask it to generate

in dev code, you'd probably need an if/else check:

  • if max_tokens == max_input_tokens: return max_tokens * 0.7 (leave some buffer for output tokens)
  • elif max_tokens == max_input_tokens + max_output_tokens: return max_input_tokens, since the decision re: buffer is probably implementation-specific

?

krrishdholakia avatar Apr 06 '24 17:04 krrishdholakia

Closing as not planned, in favor of using max_input_tokens and max_output_tokens.

Can revisit this though

krrishdholakia avatar May 08 '24 23:05 krrishdholakia