opik [FR]: Support OpenAI TTS model tracking

Proposal summary

Currently, when using OpenAI TTS models, it’s not possible to track the cost and usage of the API through the OpenAI integration. For example, we use audio.speech.with_streaming_response.create to perform text-to-speech. Is this on the roadmap, or would it be possible to add it? Thank you in advance.

Motivation

Tracing voice models is not straightforward, and we currently face limitations doing this with the OpenAI Realtime and Text-To-Speech models.

May 21 '25 11:05 jjcampana

/bounty $200

May 23 '25 09:05 vincentkoc

💎 $200 bounty • Comet

Steps to solve:

Read Contributing Docs: See the contributing guide and read how to set up Opik and contribute to various parts of the code base.
Start working: Comment /attempt #2202 with your implementation plan
Submit work: Create a pull request including /claim #2202 in the PR body to claim the bounty
Review: The team will review the PR and any clarifying questions, and if the review is successful, the changes will be merged
Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

❗ Important guidelines:

Do NOT start multiple bounties
To claim a bounty, you need to provide a short demo video of your changes in your pull request
If anything is unclear, ask for clarification before starting as this will help avoid potential rework
Low quality AI PRs will not receive review and will be closed
Please ask to be assigned before attempting to work on the bounty

Thank you for contributing to Comet!

Attempt	Started (UTC)	Solution	Actions
🟢 @vishalpatil1899	Jun 19, 2025, 09:47:56 PM	WIP
🟢 @Gmin2	Jun 23, 2025, 07:30:03 AM	#2547	Reward
🟢 @vladimirrotariu	Jul 25, 2025, 12:50:52 AM	#2829	Reward
🟢 @b4s36t4	May 27, 2025, 05:33:28 AM	WIP
🟢 @ibishal	May 27, 2025, 06:16:11 AM	WIP
🟢 @Sahelisaha04	Jul 28, 2025, 08:23:35 AM	#2836	Reward

May 23 '25 09:05 algora-pbc[bot]

@jjcampana I have assigned a public bounty to this request to hopefully speed up development. Great feature request and I'm a big fan of multi-modal use-cases.

May 23 '25 09:05 vincentkoc

/attempt #2202

May 27 '25 05:05 b4s36t4

Hey @vincentkoc, I'm trying to tackle this issue. Can you point me in the right direction on how pricing is currently handled?

Also, the solution I have in mind is SDK-level support (i.e. Python/TypeScript), not a backend-level solution. Would that be OK?

May 27 '25 05:05 b4s36t4

/attempt #2202

May 27 '25 06:05 ibishal

> hey, @vincentkoc. I'm trying to tackle the issue, Can you point me in the right direction how currently pricing is handled.
>
> Also, the solution I'm thinking is a SDK level support i.e Python/Typescript not a backend level solution would that be ok?

This would most likely be a backend and SDK change (I believe), but I will have one of the product engineers reply here with more context. cc @Lothiraldan

May 27 '25 12:05 vincentkoc

Hi @b4s36t4, great to hear you're exploring this feature!

Supporting cost and usage tracking for OpenAI’s TTS API (audio.speech.with_streaming_response.create) will require changes across multiple parts of the codebase. Here's a breakdown of the areas to focus on and how things work today:

Integration Coverage

Current Limitation: Our OpenAI integration does not yet support the TTS streaming API (audio.speech.with_streaming_response.create).
Integration Location: The relevant code is in the OpenAI integration module here: opik/integrations/openai
Streaming Considerations: Because this is a streaming API, extra care must be taken to avoid performance degradation. The tracing should:
- Not introduce significant latency
- Avoid excessive memory usage
- Safely intercept/log the stream without disrupting the user experience
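
As a rough illustration of the streaming points above, here is a minimal Python sketch of a pass-through wrapper. It is not the integration's actual code: `tap_audio_stream` and `on_complete` are hypothetical names, and the sketch assumes the streaming response can be consumed as an iterator of byte chunks. Each chunk is forwarded as soon as it arrives and only a running byte count is kept, so nothing is buffered.

```python
from typing import Callable, Iterable, Iterator


def tap_audio_stream(
    chunks: Iterable[bytes],
    on_complete: Callable[[int], None],
) -> Iterator[bytes]:
    """Yield each chunk unchanged while counting bytes; report the total at the end.

    Hypothetical helper: `on_complete` stands in for whatever the tracing
    layer would do once the stream finishes (for example, closing a span).
    """
    total_bytes = 0
    try:
        for chunk in chunks:
            total_bytes += len(chunk)
            yield chunk  # forwarded immediately, nothing is buffered
    finally:
        # Runs on normal exhaustion, on an error in the source iterator,
        # and when the consumer stops reading early.
        on_complete(total_bytes)


if __name__ == "__main__":
    fake_stream = [b"RIFF", b"\x00\x01\x02"]  # stand-in for real audio chunks
    for piece in tap_audio_stream(fake_stream, lambda n: print("bytes seen:", n)):
        pass  # a real caller would write `piece` to a file or audio device
```

Reporting from the `finally` clause means the trace can still be closed when the caller abandons the stream halfway through.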

Logging Audio Streams

Since the API returns an audio stream, it would be ideal to log this as an attachment, so users can play it back directly in the UI.
See how to do this in the docs here: Logging Attachments in Opik

Cost & Usage Tracking

The TTS models (tts-1, tts-1-hd) are billed by character count, not by token count.
We follow LiteLLM's pricing mappings: LiteLLM pricing source

To track cost and usage properly, here's what needs to happen:

SDK-Level (Python/TS) Work

Log the prompt character count under total_tokens in the trace metadata
Log the model name accurately (tts-1, tts-1-hd)
Keeping the provider name consistent as openai for TTS spans would be ideal
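
The three SDK-level points above could be sketched roughly as follows. This is only an illustration under assumptions: `build_tts_span_fields` is a hypothetical helper, the dictionary layout is not Opik's actual span schema, and `len()` counts Unicode code points, which may not match exactly how OpenAI counts billable characters.

```python
from typing import Any, Dict


def build_tts_span_fields(model: str, input_text: str) -> Dict[str, Any]:
    """Assemble the fields a TTS span would carry (hypothetical layout)."""
    character_count = len(input_text)  # assumption: one code point = one billed character
    return {
        "provider": "openai",  # kept consistent with the other OpenAI spans
        "model": model,        # e.g. "tts-1" or "tts-1-hd", passed through unchanged
        "usage": {"total_tokens": character_count},  # character count reported as total_tokens
    }


if __name__ == "__main__":
    print(build_tts_span_fields("tts-1", "Hello, world!"))
```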

Backend-Level Changes

Support new billing metric (characters instead of tokens) in: CostService.java
Implement cost logic specific to TTS in: SpanCostCalculator.java
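
For reference, the character-based cost logic amounts to multiplying the character count by a per-character price. The sketch below shows the arithmetic in Python for readability only; the real change would live in the Java classes named above. The function name and the price table are assumptions for illustration, and the prices are placeholders that should be taken from the LiteLLM pricing source rather than from this snippet.

```python
from decimal import Decimal

# Placeholder per-character prices in USD (assumed values for illustration;
# the authoritative numbers are whatever the LiteLLM pricing source lists).
ASSUMED_PRICE_PER_CHARACTER = {
    "tts-1": Decimal("0.000015"),
    "tts-1-hd": Decimal("0.000030"),
}


def estimate_tts_cost(model: str, character_count: int) -> Decimal:
    """Return an estimated cost in USD, or 0 when the model has no known price."""
    if character_count < 0:
        raise ValueError("character_count must be non-negative")
    price = ASSUMED_PRICE_PER_CHARACTER.get(model)
    if price is None:
        return Decimal("0")  # unknown model: report no cost instead of guessing
    return price * character_count


if __name__ == "__main__":
    print(estimate_tts_cost("tts-1", 1000))
```

`Decimal` is used here so that very small per-character prices do not pick up floating-point rounding error when multiplied by large counts.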

Please let me know if you have more questions

May 28 '25 15:05 Lothiraldan

@b4s36t4 @ibishal let me know if either of you need help

May 29 '25 11:05 vincentkoc

/attempt #2202

Jun 19 '25 21:06 vishalpatil1899

/attempt #2202

Jun 25 '25 16:06 Gmin2

/attemp #2202

Jun 30 '25 09:06 MAVRICK-1

/attempt #2202

Jul 25 '25 00:07 vladimirrotariu

/attempt #2202

Jul 28 '25 08:07 Sahelisaha04

If you have another open-source TTS model in mind, let me know, as I have experience with TTS models.

Dec 05 '25 12:12 kb-assert-ai