
Tracking Token Usage

Open TaylorAndStubbs opened this issue 1 year ago • 11 comments

Is your feature request related to a problem? Please describe. I need to keep track of my various clients' token usage. The OpenAI response reports completion_tokens in the following format: "prompt_tokens": 123, "completion_tokens": 55, "total_tokens": 178

And it would be super cool if those could be in the output variables.

Describe the solution you'd like program()["completion_tokens"] = 81 etc.

Describe alternatives you've considered Tokenizing the input and output myself. I could not find a way to get the raw input and output sent to OpenAI in order to count the tokens.

Additional context
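
For reference, the tokenizing alternative is straightforward with tiktoken once you have the raw text, though counts are approximate for chat models (the chat format adds a few overhead tokens per message):

```python
import tiktoken

# Count tokens yourself, assuming the raw prompt and completion text can
# be captured somewhere. Counts are approximate for chat-format prompts.
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt_text = "What is the capital of France?"
completion_text = "The capital of France is Paris."
prompt_tokens = len(enc.encode(prompt_text))
completion_tokens = len(enc.encode(completion_text))
print(prompt_tokens, completion_tokens, prompt_tokens + completion_tokens)
```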

TaylorAndStubbs avatar Jul 25 '23 20:07 TaylorAndStubbs

any updates?

williambrach avatar Aug 02 '23 09:08 williambrach

I need this as well.

lcp-lchilds avatar Aug 03 '23 09:08 lcp-lchilds

Great suggestion! Out of curiosity, if guidance needs to make multiple API calls across a program, would you prefer a granular per-call breakdown or just a simple aggregation of all the calls?

Harsha-Nori avatar Aug 08 '23 18:08 Harsha-Nori

I would prefer a per-call breakdown since I could always aggregate it later if I needed to. Would asking for both be feasible?
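
For illustration, a per-call record along these (hypothetical) lines would cover both, since aggregation is trivial once you have the breakdown:

```python
from dataclasses import dataclass

@dataclass
class CallUsage:
    # One hypothetical record per underlying API call.
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

calls = [CallUsage(123, 55, 178), CallUsage(40, 26, 66)]
aggregate = {
    "prompt_tokens": sum(c.prompt_tokens for c in calls),
    "completion_tokens": sum(c.completion_tokens for c in calls),
    "total_tokens": sum(c.total_tokens for c in calls),
}
```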

TaylorAndStubbs avatar Aug 08 '23 18:08 TaylorAndStubbs

I would like to have this feature as well. I tried to see if I could add it myself, but to my surprise, in chat-completion streaming mode the 'usage' object is missing when I print the OpenAI response.

So I tried a second (admittedly hacky) approach: calculating the tokens from the prompt and the final response (probably not what everyone wants, but it would work for me). However, I was unable to figure out how to return that data in streaming mode.

Any hints on how to do this?
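
For what it's worth, here is a rough sketch of that second approach against the raw openai<1.0 client (not guidance internals): accumulate the streamed text, then tokenize it yourself, since streamed responses omit the 'usage' object.

```python
import openai
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = "Say hello."
pieces = []
for chunk in openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    stream=True,
):
    # Role-only and final chunks have no 'content' key, hence the default.
    pieces.append(chunk["choices"][0]["delta"].get("content", ""))

completion_text = "".join(pieces)
# Approximate: the chat format adds a few overhead tokens per message.
prompt_tokens = len(enc.encode(prompt))
completion_tokens = len(enc.encode(completion_text))
print(prompt_tokens, completion_tokens, prompt_tokens + completion_tokens)
```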

fullstackwebdev avatar Aug 10 '23 23:08 fullstackwebdev

+1. I would also love to be able to get prompt_tokens prior to executing, so that we can determine how many tokens are left for the completion (for error handling).
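
Something along these lines, using tiktoken to estimate prompt_tokens up front (a workaround sketch, not a guidance feature):

```python
import tiktoken

# Pre-flight check: count prompt tokens before the call so we know how
# much room is left for the completion. CONTEXT_WINDOW is model-dependent.
CONTEXT_WINDOW = 4096  # original gpt-3.5-turbo

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
prompt = "Summarize the following document: ..."
prompt_tokens = len(enc.encode(prompt))
room_for_completion = CONTEXT_WINDOW - prompt_tokens
if room_for_completion <= 0:
    raise ValueError("Prompt alone exceeds the model's context window")
```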

mjedmonds avatar Aug 11 '23 13:08 mjedmonds

100% agree that this would be super useful. As a real-world example, I have written a token wallet that allocates and tracks token usage per user. It would be incredibly useful to have a callback that is triggered before and after each LLM call; that way I could withdraw tokens from the wallet at each step and bail out of execution early if the wallet is empty.
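
For illustration, the hooks I have in mind might look something like this (all names are made up; nothing like this exists in guidance today):

```python
# Hypothetical hook interface, only to illustrate the token-wallet use case.
class TokenWallet:
    def __init__(self, balance: int):
        self.balance = balance

    def before_llm_call(self, estimated_tokens: int) -> None:
        # Bail out before the call if the user cannot afford it.
        if self.balance < estimated_tokens:
            raise RuntimeError("Token wallet is empty; aborting early")

    def after_llm_call(self, usage: dict) -> None:
        # Withdraw the actual cost reported by the API.
        self.balance -= usage["total_tokens"]
```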

jscheel avatar Aug 23 '23 01:08 jscheel

Bump. This would be very useful to have.

MichaelOwenDyer avatar Sep 01 '23 18:09 MichaelOwenDyer

+1, would be mega-useful!

In the meantime, I've just modified the "OpenAIResponse" class in "openai_response.py" by adding print(data['usage']) at the end 😅
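
A comparable quick hack that avoids editing library files is to wrap the client call and log 'usage' there (openai<1.0 API, non-streaming responses only):

```python
import openai

_original_create = openai.ChatCompletion.create

def _create_with_usage_logging(*args, **kwargs):
    response = _original_create(*args, **kwargs)
    # Streaming responses carry no 'usage' object, so skip those.
    if not kwargs.get("stream"):
        print(response["usage"])
    return response

openai.ChatCompletion.create = _create_with_usage_logging
```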

NickSmet avatar Sep 24 '23 00:09 NickSmet

+1 here too. I'm trying to track latency and token counts to compare different prompts/guidance programs.
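
A rough comparison harness around the raw openai<1.0 client (non-streaming, just to illustrate the measurement):

```python
import time
import openai

# Measure wall-clock latency and token counts for one non-streaming call.
start = time.perf_counter()
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)
elapsed = time.perf_counter() - start
usage = response["usage"]
print(f"{elapsed:.2f}s  prompt={usage['prompt_tokens']}  "
      f"completion={usage['completion_tokens']}")
```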

jan-ninja avatar Oct 03 '23 01:10 jan-ninja

Has there been any progress?

I skimmed the open pull requests and found nothing.

avion23 avatar Apr 01 '24 08:04 avion23