
[v3.x.x] View per-request token counts

dlqqq opened this issue 2 years ago

Problem

A pervasive problem throughout the AI application space is the lack of billing transparency: most AI applications neither indicate the cost of previous messages nor provide a preliminary estimate of what an LLM prompt will cost before it is invoked.

Proposed Solution

Jupyter AI should be able to show a billing estimate per request. We can estimate the number of tokens in a prompt (#225) and implement billing estimate methods per provider. The billing estimate associated with an LLM response would then be included in the AiMessage objects streamed back to the client.
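A minimal sketch of what the per-provider hook might look like; the BillingEstimate dataclass, price field, and estimate_cost method below are hypothetical illustrations, not part of the existing provider interface:

from dataclasses import dataclass

@dataclass
class BillingEstimate:
    prompt_tokens: int
    estimated_cost_usd: float

class ExampleProvider:
    # Hypothetical per-1K-token price; each provider would supply its own.
    price_per_1k_tokens_usd = 0.0015

    def estimate_tokens(self, prompt: str) -> int:
        # Crude stand-in; see the token-counting discussion below.
        return max(1, len(prompt) // 4)

    def estimate_cost(self, prompt: str) -> BillingEstimate:
        tokens = self.estimate_tokens(prompt)
        return BillingEstimate(tokens, tokens / 1000 * self.price_per_1k_tokens_usd)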

Additionally, users should optionally be able to configure Jupyter AI to ask for confirmation once the billing estimate is generated, immediately before the prompt is submitted to a remote LLM.
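If that lands, the opt-in could be a single traitlets setting in the server config; the trait name below is hypothetical:

# jupyter_server_config.py (hypothetical trait name, illustrative only)
c.AiExtension.confirm_billing_estimate = True  # ask before sending the prompt to a remote LLM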

Additional context

The most ambiguous task here is deciding on a good way to estimate token counts, and whether to do so on a per-provider basis or with a single global heuristic.
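To make the trade-off concrete, here is a sketch of both options for OpenAI-style models, using tiktoken for the per-provider path and a rough four-characters-per-token heuristic as the global fallback (the fallback ratio is an assumption that only approximately holds for English text):

import tiktoken

def count_tokens(text: str, model: str = "gpt-3.5-turbo") -> int:
    try:
        # Per-provider: exact count from the model's own tokenizer.
        encoding = tiktoken.encoding_for_model(model)
        return len(encoding.encode(text))
    except KeyError:
        # Global heuristic: roughly 4 characters per token for English text.
        return max(1, len(text) // 4)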

dlqqq, Aug 23 '23 15:08

We should prioritize token counting over billing estimation, to avoid conveying potentially false information about costs.

The token count may be provided by LangChain. This should be an opt-in feature, off by default.
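For reference, LangChain exposes both sides of this: base language models have a get_num_tokens method for pre-call estimates, and the OpenAI callback reports actual usage after a call. A sketch follows; exact import paths vary across LangChain versions:

from langchain.callbacks import get_openai_callback
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")

# Estimate before the call, without hitting the API.
print(llm.get_num_tokens("Hello, how are you?"))

# Actual usage and cost reported by the API after the call.
with get_openai_callback() as cb:
    llm.invoke("Hello, how are you?")
print(cb.total_tokens, cb.total_cost)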

JasonWeill, Aug 30 '23 18:08

Hi @JasonWeill @dlqqq, I'm the maintainer of LiteLLM (https://github.com/BerriAI/litellm). It lets you do cost tracking for 100+ LLMs.

Usage

Docs: https://docs.litellm.ai/docs/#calculate-costs-usage-latency

from litellm import completion, completion_cost
import os

os.environ["OPENAI_API_KEY"] = "your-api-key"

# Make a completion call through LiteLLM's unified interface.
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)

# Compute the dollar cost of the call from the response object.
cost = completion_cost(completion_response=response)
print("Cost for completion call with gpt-3.5-turbo: ", f"${float(cost):.10f}")

We also let you create a self-hosted, OpenAI-compatible proxy server to make your LLM calls (100+ LLMs) and track costs and token usage. Docs: https://docs.litellm.ai/docs/simple_proxy
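As a rough sketch of that workflow (the port and api_key values below are assumptions; see the proxy docs for exact flags and defaults), you start the proxy and then point any OpenAI-compatible client at it:

litellm --model gpt-3.5-turbo

from openai import OpenAI

# Point an OpenAI-compatible client at the local proxy;
# the base_url port and api_key value here are illustrative assumptions.
client = OpenAI(base_url="http://0.0.0.0:8000", api_key="anything")
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)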

I hope this is helpful; if not, I'd love your feedback on what we can improve.

ishaan-jaff, Dec 28 '23 12:12