FastChat
Support stream_options for OpenAI API
According to the OpenAI docs (https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options), the API provides a stream_options parameter that returns token usage info for streaming requests. Please support this option for better rate-limit control.
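For reference, a minimal sketch of how a client would use this, assuming a FastChat OpenAI-compatible server at http://localhost:8000/v1 serving a model named vicuna-7b-v1.5 (both placeholders); the stream_options={"include_usage": True} field is taken from the linked OpenAI docs:

from openai import OpenAI

# Placeholder endpoint and API key for a local OpenAI-compatible server
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

stream = client.chat.completions.create(
    model="vicuna-7b-v1.5",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    # Ask the server to append a final chunk carrying token usage
    stream_options={"include_usage": True},
)

for chunk in stream:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    if chunk.usage is not None:
        # With include_usage, the last chunk has empty choices and a usage object
        print("\nusage:", chunk.usage)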
My alternative approach for now is to use AutoTokenizer from the transformers library to count tokens myself:
from transformers import AutoTokenizer

# Load the tokenizer that matches the model being served
tokenizer = AutoTokenizer.from_pretrained('model....')

# content is the text to count (e.g. the prompt or the streamed completion)
num_tokens = len(tokenizer.tokenize(content))
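This only approximates the server-side accounting: the tokenizer has to match the model actually being served, and any chat-template or special tokens the server adds will not appear in a plain tokenize() of the message content, so the counts can drift slightly from what the API would report in usage.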