[FR]: Support tracking for OpenAI TTS models
Proposal summary
Currently, when using OpenAI TTS models, it's not possible to track the cost and usage of the API through the OpenAI integration. For example, we are using `audio.speech.with_streaming_response.create` to perform Text-To-Speech. Is this on the roadmap, or would it be possible to add it? Thank you in advance.
Motivation
Tracing voice models is not straightforward, and we currently have limitations doing this with the OpenAI Realtime and Text-To-Speech models.
/bounty $200
$200 bounty • Comet
Steps to solve:
- Read Contributing Docs: see the contributing guide to learn how to set up Opik and contribute to various parts of the code base
- Start working: comment /attempt #2202 with your implementation plan
- Submit work: create a pull request including /claim #2202 in the PR body to claim the bounty
- Review: the team will review the PR and any clarifying questions, and merge the changes if successful
- Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts
Important guidelines:
- Do NOT start multiple bounties
- To claim a bounty, you need to provide a short demo video of your changes in your pull request
- If anything is unclear, ask for clarification before starting as this will help avoid potential rework
- Low quality AI PRs will not receive review and will be closed
- Please ask to be assigned before attempting to work on the bounty
Thank you for contributing to Comet!
| Attempt | Started (UTC) | Solution | Actions |
|---|---|---|---|
| @vishalpatil1899 | Jun 19, 2025, 09:47:56 PM | WIP | |
| @Gmin2 | Jun 23, 2025, 07:30:03 AM | #2547 | Reward |
| @vladimirrotariu | Jul 25, 2025, 12:50:52 AM | #2829 | Reward |
| @b4s36t4 | May 27, 2025, 05:33:28 AM | WIP | |
| @ibishal | May 27, 2025, 06:16:11 AM | WIP | |
| @Sahelisaha04 | Jul 28, 2025, 08:23:35 AM | #2836 | Reward |
@jjcampana I have assigned a public bounty to this request to hopefully speed up development. Great feature request and I'm a big fan of multi-modal use-cases.
/attempt #2202
Hey @vincentkoc, I'm trying to tackle this issue. Can you point me in the right direction on how pricing is currently handled?
Also, the solution I'm thinking of is SDK-level support (i.e. Python/TypeScript), not a backend-level change. Would that be OK?
This would most likely be a backend and SDK change (I believe), but I'll have one of the product engineers reply here with more context. cc @Lothiraldan
Hi @b4s36t4, great to hear you're exploring this feature!
Supporting cost and usage tracking for OpenAI's TTS API (`audio.speech.with_streaming_response.create`) will require changes across multiple parts of the codebase. Here's a breakdown of the areas to focus on and how things work today:
Integration Coverage
- Current Limitation: our OpenAI integration does not yet support the TTS streaming API (`audio.speech.with_streaming_response.create`).
- Integration Location: the relevant code is in the OpenAI integration module here: opik/integrations/openai
- Streaming Considerations: because this is a streaming API, extra care must be taken to avoid performance degradation (see the sketch after this list). The tracing should:
  - Not introduce significant latency
  - Avoid excessive memory usage
  - Safely intercept/log the stream without disrupting the user experience
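To keep latency and memory overhead low, a wrapper can pass audio chunks through untouched and only accumulate lightweight counters (input character count, bytes streamed, elapsed time), reporting them once the stream finishes. Below is a minimal Python sketch of that pattern; `traced_tts_stream` and its `on_complete` callback are hypothetical names used for illustration, not part of the Opik SDK, and in a real integration the callback would log an Opik span instead of printing.

```python
import time
from typing import Callable, Iterator

from openai import OpenAI

client = OpenAI()


def traced_tts_stream(
    text: str,
    model: str = "tts-1",
    voice: str = "alloy",
    on_complete: Callable[[dict], None] = print,  # hypothetical hook; a real integration would log a span here
) -> Iterator[bytes]:
    """Yield audio chunks while collecting lightweight metrics for tracing.

    Only byte counts and timing are accumulated, so the wrapper adds no
    noticeable latency and never buffers the full audio in memory.
    """
    start = time.monotonic()
    audio_bytes = 0

    with client.audio.speech.with_streaming_response.create(
        model=model, voice=voice, input=text
    ) as response:
        for chunk in response.iter_bytes():
            audio_bytes += len(chunk)
            yield chunk  # pass chunks through untouched

    # Stream finished: report usage (characters, not tokens) to the tracer.
    on_complete(
        {
            "provider": "openai",
            "model": model,
            "input_characters": len(text),  # TTS is billed per character
            "audio_bytes": audio_bytes,
            "duration_s": time.monotonic() - start,
        }
    )


# Usage: write the audio to disk while the metrics callback fires at the end.
with open("speech.mp3", "wb") as f:
    for chunk in traced_tts_stream("Hello from Opik!"):
        f.write(chunk)
```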
Logging Audio Streams
- Since the API returns an audio stream, it would be ideal to log this as an attachment, so users can play it back directly in the UI.
- See how to do this in the docs here: Logging Attachments in Opik (a rough sketch follows below)
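For reference, here is a hedged sketch of attaching the generated audio to the current trace. The `Attachment`/`opik_context.update_current_trace` usage below is based on the attachments documentation linked above; verify the exact signatures against the current SDK.

```python
import tempfile

from openai import OpenAI
from opik import Attachment, opik_context, track

client = OpenAI()


@track  # creates a trace/span for this call
def text_to_speech(text: str, model: str = "tts-1", voice: str = "alloy") -> str:
    """Synthesize speech, stream it to disk, and attach the file to the trace."""
    out_path = tempfile.NamedTemporaryFile(suffix=".mp3", delete=False).name

    with client.audio.speech.with_streaming_response.create(
        model=model, voice=voice, input=text
    ) as response:
        response.stream_to_file(out_path)  # write chunks to disk, not into memory

    # Attach the audio so users can play it back directly in the Opik UI.
    # Assumed API, based on the attachments docs linked above.
    opik_context.update_current_trace(
        attachments=[Attachment(data=out_path, content_type="audio/mpeg")]
    )
    return out_path
```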
Cost & Usage Tracking
- The TTS models (`tts-1`, `tts-1-hd`) are billed by character count, not by token count.
- We follow LiteLLM's pricing mappings: LiteLLM pricing source
To track cost and usage properly, here's what needs to happen:
SDK-Level (Python/TS) Work
- Log the prompt character count under `total_tokens` in the trace metadata (see the sketch after this list)
- Log the model name accurately (`tts-1`, `tts-1-hd`)
- Keeping the provider name consistent as `openai` for TTS spans would be ideal
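As a rough illustration of the SDK-level logging described above, the sketch below records the input character count under `total_tokens` along with the model and provider. The exact `opik_context.update_current_span` keyword arguments used here (`usage`, `model`, `provider`, `metadata`) are assumptions and should be checked against the Opik SDK reference.

```python
# Hedged sketch of the SDK-level usage payload for a TTS span.
from opik import opik_context, track


@track
def log_tts_usage(text: str, model: str = "tts-1") -> None:
    characters = len(text)  # TTS billing unit: characters, not tokens

    opik_context.update_current_span(
        model=model,        # e.g. "tts-1" or "tts-1-hd"
        provider="openai",  # keep the provider consistent with other OpenAI spans
        usage={
            # Character count reported under total_tokens, as suggested above.
            "prompt_tokens": characters,
            "completion_tokens": 0,
            "total_tokens": characters,
        },
        metadata={"billing_unit": "characters"},
    )
```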
Backend-Level Changes
- Support a new billing metric (characters instead of tokens) in: CostService.java
- Implement cost logic specific to TTS in: SpanCostCalculator.java (a sketch of the character-based math follows this list)
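The actual change belongs in the Java classes above; the short Python sketch below only pins down the character-based formula. The per-character prices are assumed placeholder values and must be verified against the LiteLLM pricing source referenced earlier.

```python
# Illustrative sketch of character-based cost calculation; the real
# implementation would live in SpanCostCalculator.java.
COST_PER_CHARACTER = {
    "tts-1": 15.0 / 1_000_000,     # assumed: $15 per 1M characters (verify against LiteLLM)
    "tts-1-hd": 30.0 / 1_000_000,  # assumed: $30 per 1M characters (verify against LiteLLM)
}


def tts_span_cost(model: str, characters: int) -> float:
    """Return the estimated cost of a TTS span billed per input character."""
    price = COST_PER_CHARACTER.get(model)
    if price is None:
        return 0.0  # unknown model: report no cost rather than guessing
    return characters * price


# Example: 1,200 characters synthesized with tts-1
print(f"{tts_span_cost('tts-1', 1200):.6f}")  # 0.018000
```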
Please let me know if you have any more questions.
@b4s36t4 @ibishal let me know if either of you need help
/attempt #2202
/attempt #2202
/attempt #2202
/attempt #2202
/attempt #2202
If you don't mind, I could also work on support for another open-source TTS model, as I have experience with TTS models.