openai-function-tokens
Suggested functionality: Estimate by model_type
First off: great tool, it saved me the headache of trying to trace the function tokens myself. A final touch could be to introduce an option for a token estimator class (tokenizer class?) that takes the model type as an attribute and then uses the tiktoken.encoding_for_model() function to retrieve the encoding.
That way, if OpenAI ever changes the encoding or uses a different encoding for newer models, the package can stay up to date. On a side note, I think the following functions are also useful, e.g. to prevent logging of huge inputs to the model:
def get_string_tokens(self, the_str: str) -> int:
    # Number of tokens the string encodes to
    return len(self.encode(the_str))

def get_limited_string(self, the_str: str, max_tokens: int) -> str:
    # Keep only the first max_tokens tokens, then decode back to text
    encoded_str = self.encode(the_str)
    return self.decode(encoded_str[:max_tokens])
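To make the suggestion concrete, here is a minimal sketch of what such a class might look like. The name TokenEstimator, the default model name, the optional encoding parameter, and the fallback to cl100k_base for unknown models are all my assumptions, not anything the package currently provides. The only tiktoken calls used are encoding_for_model(), get_encoding(), and the encoding's encode() and decode().

```python
from typing import List, Optional, Protocol


class Encoding(Protocol):
    """Minimal interface this sketch relies on; tiktoken's Encoding provides both methods."""

    def encode(self, text: str) -> List[int]: ...

    def decode(self, tokens: List[int]) -> str: ...


class TokenEstimator:
    """Hypothetical helper class, not part of openai-function-tokens."""

    def __init__(self, model_type: str = "gpt-3.5-turbo",
                 encoding: Optional[Encoding] = None) -> None:
        self.model_type = model_type
        if encoding is None:
            # Imported lazily so a custom encoding can be used without tiktoken.
            import tiktoken
            try:
                # Look up the encoding that tiktoken associates with this model.
                encoding = tiktoken.encoding_for_model(model_type)
            except KeyError:
                # Assumption: fall back to cl100k_base for unrecognized model names.
                encoding = tiktoken.get_encoding("cl100k_base")
        self._encoding = encoding

    def encode(self, the_str: str) -> List[int]:
        return self._encoding.encode(the_str)

    def decode(self, tokens: List[int]) -> str:
        return self._encoding.decode(tokens)

    def get_string_tokens(self, the_str: str) -> int:
        # Number of tokens the string encodes to.
        return len(self.encode(the_str))

    def get_limited_string(self, the_str: str, max_tokens: int) -> str:
        # Keep only the first max_tokens tokens, then decode back to text.
        return self.decode(self.encode(the_str)[:max_tokens])
```

With tiktoken installed, something like TokenEstimator("gpt-4").get_limited_string(prompt, 200) would give a truncated string that is safe to log. One caveat: cutting at a token boundary can split a multi-byte character, so the last character of the truncated string may not match the original.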
Best, Somerandomguy10111
If I get around to it, I will implement it and open a pull request myself.