
Properly return usage information for BedrockClientV2 for the Reranker model?

Open ldorigo opened this issue 10 months ago • 2 comments

It would be very useful to have the number of "used billable units" returned in the response. Right now I need to implement some complicated (and CPU-intensive) logic to tokenize the query, tokenize each document, and count how many billable requests a given request will use based on the number and length of its documents.

ldorigo avatar Feb 13 '25 12:02 ldorigo
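For readers following along, the counting logic described above can be sketched roughly as follows. This is only an illustration, not the SDK's or Bedrock's actual billing code: the function name `estimate_search_units` is hypothetical, and the two limits (a 510-token chunk budget that includes the query, and 100 documents per billable unit) are assumptions that should be checked against the current Rerank pricing documentation. Token counts are taken as inputs, so the expensive tokenization step stays separate.

```python
import math


def estimate_search_units(
    query_tokens: int,
    doc_token_counts: list[int],
    max_chunk_tokens: int = 510,  # ASSUMPTION: per-chunk limit, query included
    docs_per_unit: int = 100,     # ASSUMPTION: documents covered by one billable unit
) -> int:
    """Estimate billable units for one rerank request (hypothetical helper)."""
    # Tokens left for document text once the query is accounted for.
    budget = max_chunk_tokens - query_tokens
    if budget <= 0:
        raise ValueError("query leaves no room for document tokens")

    # A document longer than the budget is counted as several chunks;
    # every document counts as at least one chunk, even if it is empty.
    chunks = sum(max(1, math.ceil(n / budget)) for n in doc_token_counts)

    # Each billable unit covers up to `docs_per_unit` chunks.
    return math.ceil(chunks / docs_per_unit)
```

With these assumed limits, a 10-token query against 100 documents of 1,000 tokens each would count as 200 chunks, so two units rather than one.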

Hi @billytrend-cohere, if at all possible, could you indicate whether this is planned and perhaps give a rough ETA? We found that doing the tokenization ourselves is indeed quite resource-intensive at scale (and it increased memory usage significantly). I need to decide whether it makes sense to try to optimize it on our end or wait for it to be supported.

ldorigo avatar Mar 11 '25 14:03 ldorigo

Hey @ldorigo, looking into this for you! If you need to be unblocked sooner, you can use our platform to do free tokenisation: just set `offline=False` on the `tokenize` call.

billytrend-cohere avatar Mar 19 '25 15:03 billytrend-cohere
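A minimal sketch of the workaround suggested above, assuming a Cohere platform client (for example `cohere.Client`) whose `tokenize` method accepts `text`, `model`, and `offline` keyword arguments and returns an object with a `tokens` list. The helper name `count_tokens` is hypothetical, and the thread does not say which model name to pass for Reranker tokenization, so it is left as a parameter.

```python
def count_tokens(co, text: str, model: str) -> int:
    """Count tokens via the hosted tokenizer (hypothetical helper).

    `co` is assumed to be a Cohere platform client. Passing offline=False
    asks the platform to tokenize, instead of loading a tokenizer locally.
    """
    response = co.tokenize(text=text, model=model, offline=False)
    return len(response.tokens)
```

This trades local CPU and memory for one network round trip per call, so batching or caching counts for repeated documents may still be worthwhile.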