jupyter-ai
[v3.x.x] View per-request token counts
Problem
A pervasive problem throughout the AI application space is the lack of billing transparency; most AI applications neither indicate the cost of previous messages nor provide a preliminary estimate of the cost of an LLM prompt prior to invocation.
Proposed Solution
Jupyter AI should be able to show an estimated cost per request. We can estimate the number of tokens in a prompt (#225), and implement billing-estimate methods per provider. The billing estimate associated with an LLM response would then be included in the AiMessage objects streamed back to the client.
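A per-provider billing estimate could be sketched as a simple price-table lookup keyed by model. This is a minimal illustration only: the provider/model identifiers and per-1K-token prices below are hypothetical placeholders, not real rates, and the actual table would come from each provider's pricing.

```python
# Hypothetical sketch of a per-provider billing estimate.
# Model IDs and per-1K-token prices are illustrative placeholders,
# NOT real billing rates.
PRICE_PER_1K_TOKENS = {
    "openai:gpt-3.5-turbo": {"prompt": 0.0015, "completion": 0.002},
    "example-provider:model-x": {"prompt": 0.001, "completion": 0.001},
}

def estimate_cost(model_id: str, prompt_tokens: int,
                  completion_tokens: int = 0) -> float:
    """Return an estimated cost in USD for one request.

    Raises KeyError for models with no known pricing, so the UI can
    fall back to showing only the token count (see below).
    """
    prices = PRICE_PER_1K_TOKENS[model_id]
    return ((prompt_tokens / 1000) * prices["prompt"]
            + (completion_tokens / 1000) * prices["completion"])
```

Keeping the price table separate from the estimation logic means unknown models degrade gracefully to token counting rather than showing a wrong dollar figure.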
Additionally, users should optionally be able to configure Jupyter AI to ask for confirmation once the billing estimate is generated, immediately before the prompt is submitted to a remote LLM.
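The confirmation step could be gated on a user-configured cost threshold. The sketch below is a hypothetical shape for that check (the `threshold` setting and `confirm` callback are assumptions for illustration, not existing Jupyter AI configuration):

```python
# Hypothetical sketch: only prompt for confirmation when the estimated
# cost exceeds a user-configured threshold.
def should_send(estimated_cost: float, threshold: float,
                confirm=input) -> bool:
    """Return True if the request should proceed to the remote LLM.

    `confirm` is injectable so a UI (or a test) can supply the answer
    instead of reading from stdin.
    """
    if estimated_cost <= threshold:
        return True  # cheap enough: send without asking
    answer = confirm(
        f"Estimated cost ${estimated_cost:.4f} exceeds "
        f"${threshold:.4f}. Send anyway? [y/N] "
    )
    return answer.strip().lower() == "y"
```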
Additional context
The most ambiguous task here is deciding on a good way to estimate token count, and whether to do so on a per-provider basis or with a global heuristic.
We should prioritize token counting over billing estimation, to avoid conveying potentially false information about costs.
The token count may be provided by LangChain. This should be an opt-in feature, off by default.
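If a global heuristic is chosen over per-provider tokenizers, one common rule of thumb is roughly four characters per token for English text. A minimal sketch (the 4-characters-per-token ratio is an approximation, not an exact tokenizer; a provider-specific tokenizer would give exact counts):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough global heuristic: ~4 characters per token for English.

    This deliberately avoids provider-specific tokenizers; exact counts
    would require the provider's own tokenizer.
    """
    # Round to the nearest token, with a floor of 1 for non-empty UI display.
    return max(1, round(len(text) / chars_per_token))
```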
Hi @JasonWeill @dlqqq, I'm the maintainer of LiteLLM (https://github.com/BerriAI/litellm); we allow you to do cost tracking for 100+ LLMs.
Usage
Docs: https://docs.litellm.ai/docs/#calculate-costs-usage-latency
from litellm import completion, completion_cost
import os

# Set your provider API key before making the call.
os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)

# completion_cost() looks up per-model pricing and returns the cost in USD.
cost = completion_cost(completion_response=response)
print("Cost for completion call with gpt-3.5-turbo: ", f"${float(cost):.10f}")
We also allow you to create a self-hosted, OpenAI-compatible proxy server to make your LLM calls (100+ LLMs) and track costs and token usage. Docs: https://docs.litellm.ai/docs/simple_proxy
I hope this is helpful; if not, I'd love your feedback on what we can improve.