Tracking Token Usage
Is your feature request related to a problem? Please describe.
I need to keep track of my various clients' token usage. The OpenAI output reports completion_tokens in the following format: "prompt_tokens": 123, "completion_tokens": 55, "total_tokens": 178
And it would be super cool if those could be in the output variables.
Describe the solution you'd like
program()["completion_tokens"] = 81 etc.
Describe alternatives you've considered
Tokenizing the input and output myself. Could not find a way to get the raw input and output going to OpenAI in order to count the tokens.
any updates?
I need this as well.
Great suggestion! Out of curiosity, if guidance needs to make multiple API calls across a program, would you prefer a granular per-call breakdown or just a simple aggregation of all the calls?
I would prefer a per-call breakdown since I could always aggregate it later if I needed to. Would asking for both be feasible?
I would like to have this feature as well. I tried to see if I could add it myself, but to my surprise, in chat-completion streaming mode the 'usage' object is missing when I print the raw OpenAI response.
So I tried a second, cruder approach: calculating the tokens from the prompt and the final response (probably not what everyone wants, but it would work for me), but I was unable to figure out how to return that data while streaming.
Any hints on how to do this?
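For anyone who wants that fallback in the meantime, here is a minimal sketch of the count-it-yourself approach using tiktoken. The model name and sample strings are placeholders, and note this is an approximation: real chat completions add a few tokens of per-message overhead on top of the raw text.

```python
# Count tokens locally with tiktoken when the streaming response
# omits the 'usage' object.
import tiktoken

def count_tokens(text: str, model: str = "gpt-3.5-turbo") -> int:
    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        # Unknown model name: fall back to a general-purpose encoding.
        enc = tiktoken.get_encoding("cl100k_base")
    return len(enc.encode(text))

# Placeholders standing in for the captured prompt and the text
# assembled from the stream chunks.
prompt = "Tell me a joke about tokens."
completion = "Why did the token stop? It hit the context window."

usage = {
    "prompt_tokens": count_tokens(prompt),
    "completion_tokens": count_tokens(completion),
}
usage["total_tokens"] = usage["prompt_tokens"] + usage["completion_tokens"]
print(usage)
```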
+1, would also love to be able to get the prompt_tokens prior to executing, so that we can detect how many tokens the completion could have (for error handling). A sketch of such a pre-flight check follows below.
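Something like this should already be doable outside guidance, since it only needs the prompt text; the context-window size here is an assumption for the example model, so substitute your model's real limit:

```python
# Pre-flight check: count the prompt's tokens before executing so you
# know how much room is left for the completion (and can error out early).
import tiktoken

CONTEXT_WINDOW = 8192  # assumed limit for the example model; check yours

def remaining_completion_budget(prompt: str, model: str = "gpt-4") -> int:
    enc = tiktoken.encoding_for_model(model)
    return CONTEXT_WINDOW - len(enc.encode(prompt))

budget = remaining_completion_budget("Summarize the following document: ...")
if budget <= 0:
    raise ValueError("Prompt already exceeds the context window")
print(f"Up to {budget} tokens available for the completion")
```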
100% agree that this would be super useful. As a real-world example, I have written a token wallet that allocates and keeps track of token usage per user. It would be incredibly useful if I had a callback that is triggered before and after each new LLM call. This way I could withdraw the tokens from the token wallet at each step and bail the execution early if the wallet is empty.
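To make that concrete, here is a sketch of such a wallet; the before_llm_call/after_llm_call hooks are hypothetical (they are exactly what is being requested here, not something guidance exposes today):

```python
# Sketch of a per-user token wallet driven by hypothetical pre/post
# LLM-call callbacks; the hook names do not exist in guidance today.
class EmptyWalletError(Exception):
    pass

class TokenWallet:
    def __init__(self, balance: int):
        self.balance = balance

    def before_llm_call(self, estimated_prompt_tokens: int) -> None:
        # Bail out early if the user cannot even afford the prompt.
        if self.balance < estimated_prompt_tokens:
            raise EmptyWalletError("insufficient token balance")

    def after_llm_call(self, usage: dict) -> None:
        # Withdraw the actual spend reported by the API.
        self.balance -= usage["total_tokens"]
        if self.balance < 0:
            raise EmptyWalletError("token balance exhausted mid-program")

wallet = TokenWallet(balance=10_000)
wallet.before_llm_call(estimated_prompt_tokens=123)
wallet.after_llm_call({"prompt_tokens": 123, "completion_tokens": 55, "total_tokens": 178})
print(wallet.balance)  # 9822
```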
Bump. This would be very useful to have.
+1, would be mega-useful!
In the meantime, I've just modified the `OpenAIResponse` class in `openai_response.py` by adding `print(data['usage'])` at the end 😅
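Same idea, but collecting instead of printing, in case anyone wants to aggregate afterwards; the shape of `data` follows the comment above and is not verified against current guidance source:

```python
# Collect usage records into a module-level list instead of printing,
# so they can be aggregated per program run. `data` is assumed to be
# the parsed OpenAI response dict mentioned above.
USAGE_LOG: list[dict] = []

def record_usage(data: dict) -> None:
    usage = data.get("usage")
    if usage is not None:
        USAGE_LOG.append(usage)
```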
+1 here too, I'm trying to track latency and token counts to compare different prompts / guidance programs.
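Until this lands, one workaround is to time and inspect a raw call yourself, outside guidance; this sketch uses the pre-1.0 openai SDK's ChatCompletion call, and the model/messages are placeholders:

```python
# Measure latency and token usage around a raw chat-completion call
# (pre-1.0 openai SDK). Adapt to however your guidance program calls out.
import time
import openai

def timed_chat_call(messages, model="gpt-3.5-turbo"):
    start = time.perf_counter()
    response = openai.ChatCompletion.create(model=model, messages=messages)
    latency = time.perf_counter() - start
    usage = response["usage"]  # prompt_tokens / completion_tokens / total_tokens
    return response, {"latency_s": latency, **usage}

_, stats = timed_chat_call([{"role": "user", "content": "Hello!"}])
print(stats)
```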
Has there been any progress? I skimmed the pull requests and found nothing.