gpt-engineer
Print and store how many tokens were used in memory/logs
That way, we can also store this alongside benchmark results.
A huge increase in tokens would not be worth a minor improvement in benchmark results.
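For concreteness, a minimal sketch of what this could look like (the function names count_tokens/log_usage and the log file path are hypothetical, and tiktoken is assumed as the tokeniser):

import tiktoken

def count_tokens(text, model="gpt-4"):
    # Look up the BPE tokeniser that matches the model and count tokens.
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

def log_usage(prompt, completion, path="token_usage.log"):
    # Hypothetical helper: append one line per request so the log can be
    # aggregated and compared against benchmark results later.
    prompt_tokens = count_tokens(prompt)
    completion_tokens = count_tokens(completion)
    with open(path, "a") as f:
        f.write(f"prompt={prompt_tokens} completion={completion_tokens} "
                f"total={prompt_tokens + completion_tokens}\n")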
This was exactly what I was trying to decipher today. Unfortunately, the information online on OpenAI usage retrieval is quite limited.
I tried looking for some solutions that worked, but they were built on LangChain (i.e. they need some memory):
import re
import decimal

import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks import get_openai_callback

def run_chain(k, max_tokens, model_name, docs, user_question):
    llm = ChatOpenAI(model_name=model_name, temperature=0, max_tokens=max_tokens)
    chain = load_qa_chain(llm, chain_type="stuff")
    # The callback tracks token counts and cost for all calls inside the block.
    with get_openai_callback() as cb:
        response = chain.run(input_documents=docs[:k], question=user_question)
        print(cb)
        rounded_cost = extract_and_round_cost(cb)
        st.write("Using " + model_name + ", " + f"${rounded_cost}")
        st.write("---")
        return response
def extract_and_round_cost(cb):
    cb_str = str(cb)
    # Escape the parentheses and dollar sign, which are regex metacharacters.
    cost_line = re.search(r"Total Cost \(USD\): \$(.+)", cb_str)
    if cost_line:
        cost_str = cost_line.group(1)
        cost = decimal.Decimal(cost_str)
        decimal.getcontext().rounding = decimal.ROUND_HALF_UP
        rounded_cost = round(cost, 3)
        return float(rounded_cost)
This basically prints the token usage and calculates the cost.
Output:

Tokens Used: 560
    Prompt Tokens: 377
    Completion Tokens: 183
Successful Requests: 1
Total Cost (USD): $0.0009315
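Side note: instead of regex-parsing str(cb), the LangChain callback object also exposes these numbers directly as attributes (assuming a reasonably recent LangChain version), which avoids the string parsing entirely:

with get_openai_callback() as cb:
    response = chain.run(input_documents=docs[:k], question=user_question)
# total_tokens, prompt_tokens, completion_tokens and total_cost are plain
# numbers on the callback object, so no regex is needed.
print(cb.total_tokens, cb.prompt_tokens, cb.completion_tokens)
print(f"${round(cb.total_cost, 3)}")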
Not sure if this is what you had in mind.
@yitron and @AntonOsika, I looked into your suggestion. tiktoken, a module by OpenAI, is a fast BPE tokeniser for use with OpenAI's models. The code below prints the token count and stores the value in a file token_count_log.txt, which we can later use to plot graphs for benchmarking the results.
import re
import decimal

import tiktoken
import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.callbacks import get_openai_callback

def run_chain(k, max_tokens, model_name, docs, user_question):
    llm = ChatOpenAI(model_name=model_name, temperature=0, max_tokens=max_tokens)
    chain = load_qa_chain(llm, chain_type="stuff")
    with get_openai_callback() as cb:
        response = chain.run(input_documents=docs[:k], question=user_question)
        print(cb)
        rounded_cost = extract_and_round_cost(cb)
        st.write("Using " + model_name + ", " + f"${rounded_cost}")
        st.write("---")

    # Count tokens in the response with the tokeniser that matches the model.
    encoding = tiktoken.encoding_for_model(model_name)
    token_count = len(encoding.encode(response))
    print("Token count:", token_count)

    # Store token count in logs
    log_token_count(token_count)

    return response

def extract_and_round_cost(cb):
    cb_str = str(cb)
    # Escape the parentheses and dollar sign, which are regex metacharacters.
    cost_line = re.search(r"Total Cost \(USD\): \$(.+)", cb_str)
    if cost_line:
        cost_str = cost_line.group(1)
        cost = decimal.Decimal(cost_str)
        decimal.getcontext().rounding = decimal.ROUND_HALF_UP
        rounded_cost = round(cost, 3)
        return float(rounded_cost)

def log_token_count(token_count):
    # Append one count per line so the log is easy to parse later.
    with open("token_count_log.txt", "a") as f:
        f.write(str(token_count) + "\n")
An example of the output in the console and the log file:

Callback Object: <Callback object at 0x7f8a12b65470>
Using GPT-3.5, $0.035
Token count: 87
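To actually plot graphs from the log for benchmarking, a minimal sketch with matplotlib might look like this (token_count_log.txt matches the snippet above; the rest is an assumption):

import matplotlib.pyplot as plt

# Read one integer token count per line, as written by log_token_count.
with open("token_count_log.txt") as f:
    counts = [int(line) for line in f if line.strip()]

# Plot token usage per run to see whether a change blows up token consumption.
plt.plot(range(1, len(counts) + 1), counts, marker="o")
plt.xlabel("Run")
plt.ylabel("Tokens used")
plt.title("Token usage per benchmark run")
plt.savefig("token_usage.png")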
@AntonOsika I'd love to work on this issue, please assign it to me.
@shubham-attri just fork the repo, make a change, and submit a PR. That's the common way for open-source projects. :)
If you're not working on it, I'd quickly do the PR. Let me know!
Oops, just saw this! Is this assigned? If not, I can work on it.
Hey, I just went ahead and implemented it.
@yitron in open-source projects if an issue is not assigned, you can just do it. Don't wait for permission. ;)
@shubham-attri @yitron Feel free to ping me if you have any questions on contributing to open source!
Hey, got it! I was just assuming (without verifying) that it was already done, haha. Sorry!
Great job 🚀