
disallowed_special

Open: zhaochenyang20 opened this issue 1 year ago • 1 comment

I ran into a strange bug with tiktoken. Basically, we need to change our count_tokens_from_string function so that it passes disallowed_special=() to encode:

import tiktoken


def count_tokens_from_string(string: str, encoding_name: str = "cl100k_base") -> int:
    """Count the tokens in a string with OpenAI's tokenizer.

    Args:
        string: The string to count.
        encoding_name: The name of the tokenizer to use.

    Returns:
        The number of tokens in the string.
    """
    encoding = tiktoken.get_encoding(encoding_name)
    # An empty disallowed_special turns off tiktoken's check for special-token text.
    num_tokens = len(encoding.encode(string, disallowed_special=()))
    return num_tokens
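
The issue does not say what triggered the bug. One plausible cause, going by tiktoken's default of disallowed_special="all", is input that contains the literal text of a special token such as <|endoftext|>: the default encode call raises ValueError on it, while disallowed_special=() encodes it as ordinary text. The sketch below illustrates the difference; count_tokens and demo_with_tiktoken are hypothetical names used only for this example and are not part of prompt2model.

```python
def count_tokens(encoding, text: str, allow_special_text: bool = True) -> int:
    """Count tokens using any object with tiktoken's Encoding.encode interface.

    `count_tokens` is a hypothetical helper for illustration only.
    """
    if allow_special_text:
        # Empty disallowed set: special-token strings are encoded as plain text.
        return len(encoding.encode(text, disallowed_special=()))
    # tiktoken's default (disallowed_special="all") raises ValueError when the
    # text contains a special-token string such as "<|endoftext|>".
    return len(encoding.encode(text))


def demo_with_tiktoken() -> None:
    """Show both behaviours with the real tokenizer (needs tiktoken installed)."""
    import tiktoken  # imported here so the helper above has no hard dependency

    enc = tiktoken.get_encoding("cl100k_base")
    text = "Hello <|endoftext|> world"
    try:
        count_tokens(enc, text, allow_special_text=False)
    except ValueError as err:
        print("default encode raised:", type(err).__name__)
    print("tokens with disallowed_special=():", count_tokens(enc, text))
```

Calling demo_with_tiktoken() should print the ValueError from the default call, followed by a token count from the relaxed call. Whether this is the exact failure behind the issue is an assumption.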

zhaochenyang20 • Nov 07 '23 03:11