
[Bug]: Model name from completion response incompatible with completion_cost

Open dhruv-anand-aintech opened this issue 1 year ago • 7 comments

What happened?

import litellm

messages = [{"content": "What's the latest news about Israel-Palestine war?", "role": "user"}]
comp_resp = litellm.completion(model="perplexity/pplx-7b-online", messages=messages)
litellm.completion_cost(comp_resp)

Relevant log output

NotFoundError: Model not in model_prices_and_context_window.json. You passed model=pplx-7b-online

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
Cell In[10], line 3
      1 messages = [{ "content": "What's the latest news about Israel-Palestine war?","role": "user"}]
      2 comp_resp = litellm.completion(model="perplexity/pplx-7b-online", messages=messages)
----> 3 litellm.completion_cost(comp_resp)

File ~/miniforge3/lib/python3.10/site-packages/litellm/utils.py:2889, in completion_cost(completion_response, model, prompt, messages, completion, total_time)
   2887     return prompt_tokens_cost_usd_dollar + completion_tokens_cost_usd_dollar
   2888 except Exception as e:
-> 2889     raise e

File ~/miniforge3/lib/python3.10/site-packages/litellm/utils.py:2882, in completion_cost(completion_response, model, prompt, messages, completion, total_time)
   2877     elif model in litellm.replicate_models or "replicate" in model:
   2878         return get_replicate_completion_pricing(completion_response, total_time)
   2879     (
   2880         prompt_tokens_cost_usd_dollar,
   2881         completion_tokens_cost_usd_dollar,
-> 2882     ) = cost_per_token(
   2883         model=model,
   2884         prompt_tokens=prompt_tokens,
   2885         completion_tokens=completion_tokens,
   2886     )
   2887     return prompt_tokens_cost_usd_dollar + completion_tokens_cost_usd_dollar
   2888 except Exception as e:

File ~/miniforge3/lib/python3.10/site-packages/litellm/utils.py:2801, in cost_per_token(model, prompt_tokens, completion_tokens)
   2798 else:
   2799     # if model is not in model_prices_and_context_window.json, raise an exception - let users know
   2800     error_str = f"Model not in model_prices_and_context_window.json. You passed model={model}\n"
-> 2801     raise litellm.exceptions.NotFoundError(  # type: ignore
   2802         message=error_str,
   2803         model=model,
   2804         response=httpx.Response(
   2805             status_code=404,
   2806             content=error_str,
   2807             request=httpx.request(method="cost_per_token", url="https://github.com/BerriAI/litellm"),  # type: ignore
   2808         ),
   2809         llm_provider="",
   2810     )

NotFoundError: Model not in model_prices_and_context_window.json. You passed model=pplx-7b-online

Twitter / LinkedIn details

No response

dhruv-anand-aintech avatar Jan 11 '24 08:01 dhruv-anand-aintech

@dhruv-anand-aintech I believe this is caused by the model name returned by Perplexity not including the provider prefix, which is how we key it in the pricing map.

Thanks for flagging this. Will fix
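
An illustrative sketch of the lookup mismatch described above (the dict below is a stand-in for litellm's model_prices_and_context_window.json, which keys entries as "provider/model"; the price value is made up):

```python
# Stand-in for the pricing map, keyed as "provider/model".
model_prices = {
    "perplexity/pplx-7b-online": {"output_cost_per_token": 5e-07},  # illustrative value
}

response_model = "pplx-7b-online"  # bare name echoed back by the API

print(response_model in model_prices)                  # bare name misses the map
print("perplexity/" + response_model in model_prices)  # prefixed name hits
```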

krrishdholakia avatar Jan 11 '24 11:01 krrishdholakia

Can you let me know if this works:

messages = [{ "content": "What's the latest news about Israel-Palestine war?","role": "user"}]
comp_resp = litellm.completion(model="perplexity/pplx-7b-online", messages=messages)
comp_resp.model = "perplexity/" + comp_resp.model
litellm.completion_cost(comp_resp)

krrishdholakia avatar Jan 11 '24 11:01 krrishdholakia

yes, it does

dhruv-anand-aintech avatar Jan 11 '24 11:01 dhruv-anand-aintech

We can make this even better now that we have custom_llm_provider in the ModelResponse: https://github.com/BerriAI/litellm/pull/1432

Which means this should work, after some edits to completion_cost:

messages = [{ "content": "What's the latest news about Israel-Palestine war?","role": "user"}]
comp_resp = litellm.completion(model="perplexity/pplx-7b-online", messages=messages)
litellm.completion_cost(comp_resp)
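
A hedged sketch of the prefixing logic such a fix could use; the function name and signature here are illustrative, not litellm's actual implementation:

```python
from typing import Optional

def resolve_model_key(response_model: str, custom_llm_provider: Optional[str]) -> str:
    """Re-attach the provider prefix if the response model lacks it,
    so the pricing-map lookup uses the "provider/model" key."""
    if custom_llm_provider and not response_model.startswith(custom_llm_provider + "/"):
        return custom_llm_provider + "/" + response_model
    return response_model
```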

ishaan-jaff avatar Jan 13 '24 01:01 ishaan-jaff

Addressed here: @dhruv-anand-aintech https://github.com/BerriAI/litellm/pull/1439

ishaan-jaff avatar Jan 13 '24 20:01 ishaan-jaff

@dhruv-anand-aintech improvement is in prod: https://github.com/BerriAI/litellm/releases/tag/v1.17.5

This code snippet should just work now - can you confirm it works on 1.17.5, @dhruv-anand-aintech?

messages = [{ "content": "What's the latest news about Israel-Palestine war?","role": "user"}]
comp_resp = litellm.completion(model="perplexity/pplx-7b-online", messages=messages)
litellm.completion_cost(comp_resp)

ishaan-jaff avatar Jan 13 '24 22:01 ishaan-jaff

Bump @dhruv-anand-aintech - can you confirm this is fixed for you?

ishaan-jaff avatar Jan 18 '24 15:01 ishaan-jaff

works

dhruv-anand-aintech avatar Jan 31 '24 08:01 dhruv-anand-aintech

Actually, the cost calculation is wrong: https://github.com/BerriAI/litellm/blob/eb9eca6f3709cc894dc0542a4ecce5cffc2383de/model_prices_and_context_window.json#L1690-L1703

Per https://docs.perplexity.ai/docs/pricing, the pricing has a $5 per 1,000 requests component plus $0.28 per 1M output tokens. Your mapping only has $0.0005 per output token, which gives the wrong final cost.
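
A worked example using the figures quoted in this comment ($5 per 1,000 requests plus $0.28 per 1M output tokens); these numbers come from the comment, not from an authoritative price list, so check the Perplexity docs for current rates:

```python
# Pricing components as quoted above (illustrative, may be outdated).
PER_REQUEST_USD = 5 / 1000               # $5 per 1,000 requests
PER_OUTPUT_TOKEN_USD = 0.28 / 1_000_000  # $0.28 per 1M output tokens

def online_model_cost(num_requests: int, output_tokens: int) -> float:
    """Total cost = request-fee component + output-token component."""
    return num_requests * PER_REQUEST_USD + output_tokens * PER_OUTPUT_TOKEN_USD

# One request producing 200 output tokens: the $0.005 request fee dominates,
# so a per-token-only mapping badly underestimates the real cost.
print(round(online_model_cost(1, 200), 6))  # 0.005056
```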

dhruv-anand-aintech avatar Jan 31 '24 17:01 dhruv-anand-aintech

Thanks @dhruv-anand-aintech, I believe it's changed since we last checked. I'll update it now.

krrishdholakia avatar Jan 31 '24 21:01 krrishdholakia

fixed!

krrishdholakia avatar Jan 31 '24 21:01 krrishdholakia