Support returning token usage for stream mode in the API
Currently the FastChat backend cannot return the number of tokens consumed in stream mode, but our API gateway needs this number for billing and metrics.
Therefore, we intend to provide an API that reports the number of tokens used by each request.
The initial design is as follows:
- The arcadia back end adds one header, `X-Request-ID`, to each response. Its value is a unique marker for the request.
- arcadia provides a GET API, `/sum-tokens`, that requires no authentication or authorization and returns the token counts of the response for a given request ID. An example request and its response:
GET /sum-tokens?id=xxxx
{
"prompt_tokens": 9,
"completion_tokens": 12,
"total_tokens": 21
}
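
To make the proposal concrete for discussion, here is a minimal Python sketch of the bookkeeping it implies. It is not arcadia's implementation: `TokenUsageStore`, `handle_sum_tokens`, and the error bodies are hypothetical names chosen for illustration, the store is in-memory only, and the per-chunk token counts are assumed to come from whatever tokenizer the back end already uses.

```python
import json
import threading
import uuid
from urllib.parse import parse_qs, urlparse


class TokenUsageStore:
    """Hypothetical thread-safe map from request ID to token usage."""

    def __init__(self):
        self._lock = threading.Lock()
        self._usage = {}

    def start(self, prompt_tokens):
        """Register a request; the returned ID would be sent as X-Request-ID."""
        request_id = uuid.uuid4().hex
        with self._lock:
            self._usage[request_id] = {
                "prompt_tokens": prompt_tokens,
                "completion_tokens": 0,
            }
        return request_id

    def add_completion_tokens(self, request_id, count):
        """Accumulate tokens as each streamed chunk is sent to the client."""
        with self._lock:
            self._usage[request_id]["completion_tokens"] += count

    def summary(self, request_id):
        """Return the usage dict for a request ID, or None if it is unknown."""
        with self._lock:
            entry = self._usage.get(request_id)
            if entry is None:
                return None
            prompt = entry["prompt_tokens"]
            completion = entry["completion_tokens"]
        return {
            "prompt_tokens": prompt,
            "completion_tokens": completion,
            "total_tokens": prompt + completion,
        }


def handle_sum_tokens(store, url):
    """Return (status, JSON body) for GET /sum-tokens?id=<request id>."""
    parsed = urlparse(url)
    if parsed.path != "/sum-tokens":
        return 404, json.dumps({"error": "not found"})
    ids = parse_qs(parsed.query).get("id")
    if not ids:
        return 400, json.dumps({"error": "missing id"})
    usage = store.summary(ids[0])
    if usage is None:
        return 404, json.dumps({"error": "unknown request id"})
    return 200, json.dumps(usage)


# Simulate one streamed response: 9 prompt tokens, then chunks of 5, 4, 3 tokens.
store = TokenUsageStore()
request_id = store.start(prompt_tokens=9)
for chunk_tokens in (5, 4, 3):
    store.add_completion_tokens(request_id, chunk_tokens)
status, body = handle_sum_tokens(store, f"/sum-tokens?id={request_id}")
```

Points the sketch leaves open and that probably need discussion: how long entries are retained, whether a query made before the stream finishes should return a partial count, and how the store is shared if the back end runs with more than one replica.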
cc @wojesen @nkwangleiGIT @bjwswang we need more discussion here.