Track costs for streaming with Cohere

Open brenkao opened this issue 9 months ago • 1 comments

Is your feature request related to a problem? Please describe.
Many providers are starting to include usage data in streamed responses. This makes it much easier for Mirascope to calculate cost.

Describe the solution you'd like
Add a total_cost property to CohereCallResponseChunk. Read the event with "event_type": "stream-end" sent by the Cohere API and calculate cost using its token counts:

"token_count": {
    "prompt_tokens": ...,
    "response_tokens": ...,
    "total_tokens": ...,
    "billed_tokens": ...,
}

Update https://github.com/Mirascope/mirascope/blob/dev/mirascope/cohere/utils.py as necessary.
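
As a rough illustration of the calculation requested above, here is a minimal sketch that turns a token_count mapping like the one shown into a dollar cost. The function name cost_from_token_count and the per-million-token price parameters are hypothetical and not part of Mirascope or the Cohere SDK; real prices would have to come from the provider's pricing for the model in use.

```python
from typing import Mapping, Optional


def cost_from_token_count(
    token_count: Mapping[str, int],
    prompt_price_per_million: float,
    response_price_per_million: float,
) -> Optional[float]:
    """Return the cost in dollars, or None if usage data is missing.

    Hypothetical helper: the key names follow the token_count payload
    quoted in this issue; the prices are caller-supplied assumptions.
    """
    prompt_tokens = token_count.get("prompt_tokens")
    response_tokens = token_count.get("response_tokens")
    if prompt_tokens is None or response_tokens is None:
        return None
    return (
        prompt_tokens * prompt_price_per_million
        + response_tokens * response_price_per_million
    ) / 1_000_000


# Example with made-up prices (dollars per million tokens).
example_cost = cost_from_token_count(
    {"prompt_tokens": 1000, "response_tokens": 500}, 0.5, 1.5
)
```

Returning None rather than 0 when the counts are absent keeps "no usage reported" distinguishable from "free".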

brenkao avatar May 07 '24 23:05 brenkao

See #214 since these are related.

Namely: https://github.com/Mirascope/mirascope/issues/214#issuecomment-2098893697

willbakst avatar May 08 '24 16:05 willbakst

I am working on this, but the problem I am facing is that the event returned by co.chat_stream() is of type StreamedChatResponse, and its response property is of type NonStreamedChatResponse, which does not have a token_count property. I am not sure how to access token_count here.

tvj15 avatar May 31 '24 00:05 tvj15

Doesn't the NonStreamedChatResponse type have response.meta.billed_units, which returns an ApiMetaBilledUnits from which we should be able to grab the same usage statistics that we do for the normal response? We can likely massage that data into the form we need to calculate cost, right?
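
A minimal sketch of that "massaging" step, under stated assumptions: it assumes billed_units exposes input_tokens and output_tokens attributes, and it uses SimpleNamespace stand-ins instead of real Cohere SDK objects. The helper name usage_from_response is hypothetical.

```python
from types import SimpleNamespace
from typing import Any, Optional, Tuple


def usage_from_response(response: Any) -> Tuple[Optional[int], Optional[int]]:
    """Extract (input_tokens, output_tokens) from response.meta.billed_units.

    Hypothetical helper: the attribute names are assumptions about the
    response shape. Every lookup is defensive so that a response without
    meta or billed_units yields (None, None) instead of raising.
    """
    meta = getattr(response, "meta", None)
    billed_units = getattr(meta, "billed_units", None)
    if billed_units is None:
        return None, None
    input_tokens = getattr(billed_units, "input_tokens", None)
    output_tokens = getattr(billed_units, "output_tokens", None)
    return (
        int(input_tokens) if input_tokens is not None else None,
        int(output_tokens) if output_tokens is not None else None,
    )


# Stand-in objects that mimic the assumed shape of the final response.
fake_response = SimpleNamespace(
    meta=SimpleNamespace(
        billed_units=SimpleNamespace(input_tokens=12.0, output_tokens=34.0)
    )
)
usage = usage_from_response(fake_response)
```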

willbakst avatar May 31 '24 03:05 willbakst

Yes, it does, but according to the API docs, the streamed response has no meta.billed_units property. It does have token_count, though. I can look again at what is happening on the API side and update here.

tvj15 avatar Jun 01 '24 23:06 tvj15

This is partially implemented in #307: Cohere chunks now contain input_tokens and output_tokens, which can be used to calculate cost. The only remaining work is to pass cost into CohereCallResponseChunk.
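
To make the remaining step concrete, here is a minimal sketch of a chunk type that derives cost from input_tokens and output_tokens. ChunkWithCost and total_stream_cost are hypothetical stand-ins, not the actual CohereCallResponseChunk, and the price fields are placeholder assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ChunkWithCost:
    """Hypothetical stand-in for a streamed chunk that may carry usage."""

    content: str
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    # Placeholder prices in dollars per million tokens (assumptions).
    input_price_per_million: float = 0.0
    output_price_per_million: float = 0.0

    @property
    def cost(self) -> Optional[float]:
        # Most chunks carry no usage; only report a cost when both counts exist.
        if self.input_tokens is None or self.output_tokens is None:
            return None
        return (
            self.input_tokens * self.input_price_per_million
            + self.output_tokens * self.output_price_per_million
        ) / 1_000_000


def total_stream_cost(chunks: Iterable[ChunkWithCost]) -> Optional[float]:
    """Sum the cost of every chunk that reports one, or None if none do."""
    costs = [chunk.cost for chunk in chunks if chunk.cost is not None]
    return sum(costs) if costs else None


stream = [
    ChunkWithCost("Hel"),
    ChunkWithCost("lo", input_tokens=2000, output_tokens=1000,
                  input_price_per_million=1.0, output_price_per_million=2.0),
]
stream_cost = total_stream_cost(stream)
```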

brenkao avatar Jun 07 '24 00:06 brenkao