openai-java: This library's token counting for chat models has many errors

Open CodeInDreams opened this issue 1 year ago • 4 comments

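Comparing against the token counts returned by OpenAI, I found that nearly every chat model has either a wrong calculation method or a skewed result, so I modeled the problem from scratch and wrote my own token counting tool.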

Parts of your token counting code have serious errors. Here are some of them:

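I have limited time and cannot submit fixes to the open-source code. This issue is only meant to let you know that the vast majority of the token counts are wrong; please look into it yourself when you have the time.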

Dec 11 '23 02:12 CodeInDreams

Thanks. Indeed, there is no guarantee that the result matches the official response exactly.

You can see that the official cookbook How_to_count_tokens_with_tiktoken also mentions this:

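It is also only an estimate. This calculation is mainly used to check a request against certain limits before sending it, so a small error does not affect the logic.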
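For reference, the counting rule in the cookbook can be sketched roughly as below. This is a minimal illustration, not this library's implementation: the class `ChatTokenEstimate`, the `Message` type, and the `legacy0301` flag are hypothetical names, and `countWords` is only a whitespace-based stand-in for a real BPE encoder such as cl100k_base, so the numbers it produces are not real token counts. The per-message and per-name overheads (3 and 1, or 4 and -1 for gpt-3.5-turbo-0301) and the 3 tokens added for reply priming are the values given in the cookbook.

```java
import java.util.List;
import java.util.function.ToIntFunction;

final class ChatTokenEstimate {

    /** One chat message; name may be null. Hypothetical type for this sketch. */
    static final class Message {
        final String role;
        final String content;
        final String name;

        Message(String role, String content, String name) {
            this.role = role;
            this.content = content;
            this.name = name;
        }
    }

    /**
     * Estimates prompt tokens following the cookbook rule:
     * a fixed overhead per message, the encoded length of each field,
     * an adjustment when a name is present, and 3 tokens for reply priming.
     */
    static int estimate(List<Message> messages,
                        ToIntFunction<String> tokenCount,
                        boolean legacy0301) {
        int perMessage = legacy0301 ? 4 : 3;  // cookbook: tokens_per_message
        int perName = legacy0301 ? -1 : 1;    // cookbook: tokens_per_name
        int total = 0;
        for (Message m : messages) {
            total += perMessage;
            total += tokenCount.applyAsInt(m.role);
            total += tokenCount.applyAsInt(m.content);
            if (m.name != null) {
                total += tokenCount.applyAsInt(m.name) + perName;
            }
        }
        return total + 3; // every reply is primed with an assistant header
    }

    /** Stand-in tokenizer (assumption): counts whitespace-separated words. */
    static int countWords(String s) {
        String t = s.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    public static void main(String[] args) {
        List<Message> messages = List.of(
                new Message("system", "You are helpful.", null),
                new Message("user", "Hello there", "alice"));
        int estimate = estimate(messages, ChatTokenEstimate::countWords, false);
        System.out.println("estimated prompt tokens: " + estimate);
    }
}
```

Passing the encoder in as a `ToIntFunction<String>` keeps the overhead arithmetic separate from the tokenizer, so the same rule can be checked against the `prompt_tokens` value returned by the API once a real encoder is plugged in.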

Dec 11 '23 05:12 forestwanglin

Token counting does not support gpt-3.5-turbo-0301.

Dec 11 '23 09:12 forestwanglin

I updated the calculation to follow the counting rules in the official demo. Tested so far:

Updated the counting rules for functions; see FunctionFormat for details.

Dec 12 '23 11:12 forestwanglin

More issues about token calculation can be found in issue #4.

Jan 30 '24 04:01 forestwanglin