
fix: estimated prompt tokens are not equal to api response

Open · aiperon opened this issue 1 year ago · 3 comments

I found an issue with the prompt token count calculation for GPT-3.5. On every call to the `__count_tokens()` function, the returned value differs from the value returned by the API, and the difference is always equal to the number of messages in the conversation. The most likely reason is that the condition `if key == "name":` never evaluates to `True`. It looks like the condition should be replaced with `key == "role"`. I know it comes from the openai-cookbook article, but my assumption is that things have changed since.

In addition, for the GPT-4 model the calculated count is always equal to the value returned by the API, so we should not add `tokens_per_name` for this model.
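To make the two observations above concrete, here is a minimal sketch of the cookbook-style accounting with the proposed `key == "role"` fix applied. The bot's real `__count_tokens()` uses tiktoken; `fake_encode()` below is a stand-in (a whitespace split) so the example runs without third-party packages, and the model-specific constants reflect this PR's assumptions, not official values.

```python
def fake_encode(text):
    """Stand-in for tiktoken's encoding.encode(); real counts differ."""
    return text.split()

def count_prompt_tokens(messages, model="gpt-3.5-turbo-0613"):
    if model.startswith("gpt-4"):
        tokens_per_message = 3
        tokens_per_name = 0   # observation above: gpt-4 needs no per-name token
    else:
        tokens_per_message = 4
        tokens_per_name = -1  # gpt-3.5 convention from the cookbook
    num_tokens = 0
    for message in messages:
        num_tokens += tokens_per_message
        for key, value in message.items():
            num_tokens += len(fake_encode(value))
            if key == "role":  # the proposed fix: was `key == "name"`
                num_tokens += tokens_per_name
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens
```

Note that with this fix the `role` key fires exactly once per message, so the net per-message overhead becomes 4 − 1 = 3 for GPT-3.5, matching the 3 used for GPT-4.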

This PR fixes prompt count calculation for GPT-3.5.

Detailed examples (with the `STREAM=false` param) for the 3.5 and 4 models before the fix are below. After this fix, the calculated total and the API response are equal.

`gpt-3.5-turbo-0613`

| Chat msg | Conversation msg | After 1st msg | After 2nd msg |
| --- | --- | --- | --- |
| First question | `{ role: "system",` | 1 | 1 |
| | `content: "You are a helpful assistant." }` | 6 | 6 |
| | `{ role: "user",` | 1 | 1 |
| | `content: "What's your name" }` | 4 | 4 |
| Response | `{ role: "assistant",` | | 1 |
| | `content: "I am a helpful digital assistant and don't have a personal name. You can just call me "Assistant". How can I assist you today?" }` | | 29 |
| Second question | `{ role: "user",` | | 1 |
| | `content: "Is it OK?" }` | | 4 |
| Temporary total | | 12 | 47 |
| Msg count | | 2 | 4 |
| Per msg tokens | | 4 | 4 |
| Per msg × msg count | | 8 | 16 |
| Static add-on | | 3 | 3 |
| Calculated total | | 23 | 66 |
| prompt_tokens from API response | | 21 | 62 |
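The table's bottom rows can be checked directly: the calculated total is the temporary (content) total plus the per-message overhead times the message count plus the static add-on. The overestimate equals the message count in each case.

```python
# Reproducing the "Calculated total" rows of the gpt-3.5-turbo-0613 table:
# calculated = temporary_total + per_msg_tokens * msg_count + static_add_on
per_msg, static_addon = 4, 3

after_1st = 12 + per_msg * 2 + static_addon  # 23, vs. 21 reported by the API
after_2nd = 47 + per_msg * 4 + static_addon  # 66, vs. 62 reported by the API

# The overestimate equals the message count (2 and 4 respectively):
assert after_1st - 21 == 2 and after_2nd - 62 == 4
```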

`gpt-4-0613`

| Chat msg | Conversation msg | After 1st msg | After 2nd msg |
| --- | --- | --- | --- |
| First question | `{ role: "system",` | 1 | 1 |
| | `content: "You are a helpful assistant." }` | 6 | 6 |
| | `{ role: "user",` | 1 | 1 |
| | `content: "What's your name" }` | 4 | 4 |
| Response | `{ role: "assistant",` | | 1 |
| | `content: "I'm OpenAI, a virtual assistant here to help answer your questions and provide information." }` | | 18 |
| Second question | `{ role: "user",` | | 1 |
| | `content: "Is it OK?" }` | | 4 |
| Temporary total | | 12 | 36 |
| Msg count | | 2 | 4 |
| Per msg tokens | | 3 | 3 |
| Per msg × msg count | | 6 | 12 |
| Static add-on | | 3 | 3 |
| Calculated total | | 21 | 51 |
| prompt_tokens from API response | | 21 | 51 |
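The same arithmetic for `gpt-4-0613`, where the per-message overhead is 3 tokens, already matches the API's `prompt_tokens` exactly, which is why no `tokens_per_name` adjustment is needed for this model.

```python
# Reproducing the "Calculated total" rows of the gpt-4-0613 table:
per_msg, static_addon = 3, 3

after_1st = 12 + per_msg * 2 + static_addon  # 21 == prompt_tokens from the API
after_2nd = 36 + per_msg * 4 + static_addon  # 51 == prompt_tokens from the API
```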

aiperon avatar Aug 18 '23 08:08 aiperon

Thanks @aiperon. I wonder if it would be worth opening an issue in the openai-cookbook repo?

n3d1117 avatar Sep 11 '23 20:09 n3d1117

> Thanks @aiperon. I wonder if it would be worth opening an issue in the openai-cookbook repo?

I believe it would be. If you approve my changes, I can open a pull request in their repo.

aiperon avatar Sep 11 '23 20:09 aiperon

@aiperon Sorry, I'm a bit short on free time at the moment, so I won't be able to test your changes quickly. In the meantime, please go ahead and open the PR in their repo! I'll keep this PR open for further updates.

n3d1117 avatar Sep 13 '23 21:09 n3d1117