openai-java

How to count tokens?

Open xujimu opened this issue 2 years ago • 4 comments

How to count tokens?

xujimu avatar Feb 18 '23 03:02 xujimu

This is well covered in the OpenAI documentation; it is not specific to this project.

https://platform.openai.com/tokenizer

By way of example:

Prices are per 1,000 tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph is 35 tokens.

https://openai.com/api/pricing/

[Tokens](https://platform.openai.com/docs/introduction/tokens)

Our models understand and process text by breaking it down into tokens. Tokens can be words or just chunks of characters. For example, the word “hamburger” gets broken up into the tokens “ham”, “bur” and “ger”, while a short and common word like “pear” is a single token. Many tokens start with a whitespace, for example “ hello” and “ bye”.

The number of tokens processed in a given API request depends on the length of both your inputs and outputs. As a rough rule of thumb, 1 token is approximately 4 characters or 0.75 words for English text. One limitation to keep in mind is that your text prompt and generated completion combined must be no more than the model's maximum context length (for most models this is 2048 tokens, or about 1500 words). Check out our [tokenizer tool](https://platform.openai.com/tokenizer) to learn more about how text translates to tokens.

https://platform.openai.com/docs/introduction/overview
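If an approximate count is enough, the rule of thumb quoted above (1 token is roughly 4 characters or 0.75 words of English text) can be sketched in Java. This is only an estimate under that assumption, not a real tokenizer; the class and method names (`TokenEstimate`, `estimateByChars`, `estimateByWords`, `fitsContext`) are made up for illustration and are not part of openai-java.

```java
public final class TokenEstimate {
    private TokenEstimate() {}

    /** Rough estimate from the character count: about 4 characters per token. */
    public static int estimateByChars(String text) {
        return (int) Math.ceil(text.length() / 4.0);
    }

    /** Rough estimate from the word count: about 0.75 words per token. */
    public static int estimateByWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        int words = trimmed.split("\\s+").length;
        return (int) Math.ceil(words / 0.75);
    }

    /**
     * Checks whether the estimated prompt size plus the requested completion
     * size stays within the model's maximum context length.
     */
    public static boolean fitsContext(String prompt, int maxCompletionTokens, int contextLength) {
        return estimateByChars(prompt) + maxCompletionTokens <= contextLength;
    }

    public static void main(String[] args) {
        String prompt = "How many tokens does this sentence use?";
        System.out.println("estimate by characters: " + estimateByChars(prompt));
        System.out.println("estimate by words: " + estimateByWords(prompt));
        // 2048 is the context length the quoted documentation gives for most models.
        System.out.println("fits in 2048 with 256 for the completion: "
                + fitsContext(prompt, 256, 2048));
    }
}
```

The two estimates will usually disagree a little, and both can be well off for code or non-English text, so leave some headroom when checking against the context length.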

cryptoapebot avatar Feb 18 '23 15:02 cryptoapebot

Maybe you can use org.python.util.PythonInterpreter to call the Python tool: https://github.com/openai/tiktoken (of course, the performance will not be good).

coldairance avatar Mar 03 '23 07:03 coldairance

The response also reports the total number of tokens: https://platform.openai.com/docs/api-reference/chat/create
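As a minimal sketch of reading that number: the chat completion response carries a `usage` object with `prompt_tokens`, `completion_tokens` and `total_tokens`. The example below pulls those integers out of a raw JSON string with a regular expression so that it needs no dependencies. The `UsageReader` class and the sample JSON are hand-written for illustration; in a real project you would more likely use a JSON library or whatever typed response object your client exposes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class UsageReader {
    private UsageReader() {}

    /**
     * Returns the integer value of the named field, for example "total_tokens".
     * This is a naive text search, not a JSON parser: it returns the first
     * match anywhere in the string.
     */
    public static int readIntField(String responseJson, String fieldName) {
        Pattern pattern = Pattern.compile("\"" + Pattern.quote(fieldName) + "\"\\s*:\\s*(\\d+)");
        Matcher matcher = pattern.matcher(responseJson);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No field '" + fieldName + "' in response");
        }
        return Integer.parseInt(matcher.group(1));
    }

    public static void main(String[] args) {
        // Hand-written sample shaped like the "usage" part of a response.
        String sample = "{\"id\":\"chatcmpl-example\",\"usage\":"
                + "{\"prompt_tokens\":9,\"completion_tokens\":12,\"total_tokens\":21}}";
        System.out.println("prompt: " + readIntField(sample, "prompt_tokens"));
        System.out.println("completion: " + readIntField(sample, "completion_tokens"));
        System.out.println("total: " + readIntField(sample, "total_tokens"));
    }
}
```

Note that this only tells you the count after the request has been made, so it helps with accounting but not with checking a prompt against the context limit beforehand.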

coldairance avatar Mar 03 '23 08:03 coldairance

Also, OpenAI has a tokenizer tool: https://platform.openai.com/tokenizer

Inspecting the page, it looks like it calls cdn.openai.com, but I can't determine the exact call from the app.

cryptoapebot avatar Mar 03 '23 16:03 cryptoapebot