[FR]: Support OpenAI TTS models tracking

Open jjcampana opened this issue 7 months ago • 9 comments

Proposal summary

Currently, when using OpenAI TTS models, it's not possible to track the cost and usage of the API through the OpenAI integration. For example, we use audio.speech.with_streaming_response.create to perform text-to-speech. Is this on the roadmap, or would it be possible to add it? Thank you in advance.

Motivation

Tracing voice models is not straightforward, and we currently face limitations doing this with the OpenAI Realtime and Text-To-Speech models.

jjcampana avatar May 21 '25 11:05 jjcampana

/bounty $200

vincentkoc avatar May 23 '25 09:05 vincentkoc

💎 $200 bounty • Comet

Steps to solve:

  1. Read Contributing Docs: See the contributing guide and read how to set up Opik and contribute to various parts of the code base.
  2. Start working: Comment /attempt #2202 with your implementation plan
  3. Submit work: Create a pull request including /claim #2202 in the PR body to claim the bounty
  4. Review: The team will review the PR and answer any clarifying questions; if the review is successful, the changes will be merged
  5. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

❗ Important guidelines:

  • Do NOT start multiple bounties
  • To claim a bounty, you need to provide a short demo video of your changes in your pull request
  • If anything is unclear, ask for clarification before starting as this will help avoid potential rework
  • Low quality AI PRs will not receive review and will be closed
  • Please ask to be assigned before attempting to work on the bounty

Thank you for contributing to Comet!

Attempt Started (UTC) Solution Actions
🟢 @vishalpatil1899 Jun 19, 2025, 09:47:56 PM WIP
🟢 @Gmin2 Jun 23, 2025, 07:30:03 AM #2547 Reward
🟢 @vladimirrotariu Jul 25, 2025, 12:50:52 AM #2829 Reward
🟢 @b4s36t4 May 27, 2025, 05:33:28 AM WIP
🟢 @ibishal May 27, 2025, 06:16:11 AM WIP
🟢 @Sahelisaha04 Jul 28, 2025, 08:23:35 AM #2836 Reward

algora-pbc[bot] avatar May 23 '25 09:05 algora-pbc[bot]

@jjcampana I have assigned a public bounty to this request to hopefully speed up development. Great feature request and I'm a big fan of multi-modal use-cases.

vincentkoc avatar May 23 '25 09:05 vincentkoc

/attempt #2202

b4s36t4 avatar May 27 '25 05:05 b4s36t4

Hey @vincentkoc, I'm trying to tackle this issue. Can you point me in the right direction on how pricing is currently handled?

Also, the solution I have in mind is SDK-level support (i.e., Python/TypeScript), not a backend-level solution. Would that be OK?

b4s36t4 avatar May 27 '25 05:05 b4s36t4

/attempt #2202

ibishal avatar May 27 '25 06:05 ibishal

Hey @vincentkoc, I'm trying to tackle this issue. Can you point me in the right direction on how pricing is currently handled?

Also, the solution I have in mind is SDK-level support (i.e., Python/TypeScript), not a backend-level solution. Would that be OK?

This would most likely be a backend and SDK change (I believe), but I will have one of the product engineers reply here with more context. cc @Lothiraldan

vincentkoc avatar May 27 '25 12:05 vincentkoc

Hi @b4s36t4, great to hear you're exploring this feature!

Supporting cost and usage tracking for OpenAI's TTS API (audio.speech.with_streaming_response.create) will require changes across multiple parts of the codebase. Here's a breakdown of the areas to focus on and how things work today:


Integration Coverage

  • Current Limitation: Our OpenAI integration does not yet support the TTS streaming API (audio.speech.with_streaming_response.create).

  • Integration Location: The relevant code is in the OpenAI integration module here: opik/integrations/openai

  • Streaming Considerations: Because this is a streaming API, extra care must be taken to avoid performance degradation. The tracing should:

    • Not introduce significant latency
    • Avoid excessive memory usage
    • Safely intercept/log the stream without disrupting the user experience
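As a rough illustration of these constraints, here is a minimal Python sketch of a pass-through wrapper that observes an audio stream without buffering it. The names `traced_chunks` and `on_complete` are hypothetical and are not part of Opik or the OpenAI SDK. The chunk source is left abstract, so any iterable of bytes can stand in for the real streaming response.

```python
from typing import Callable, Iterable, Iterator


def traced_chunks(
    chunks: Iterable[bytes],
    on_complete: Callable[[int, int], None],
) -> Iterator[bytes]:
    """Yield each audio chunk unchanged while counting chunks and bytes.

    Nothing is buffered here, so memory use stays constant regardless of
    how long the audio is, and each chunk is forwarded as soon as it arrives.
    """
    total_bytes = 0
    chunk_count = 0
    try:
        for chunk in chunks:
            total_bytes += len(chunk)
            chunk_count += 1
            yield chunk
    finally:
        # Runs when the stream is exhausted, when the source raises,
        # and when the consumer closes the generator early.
        on_complete(chunk_count, total_bytes)


# Usage with a fake stream standing in for the real TTS response.
stats = {}


def record(n_chunks: int, n_bytes: int) -> None:
    # Hypothetical callback: a real integration might update a span here.
    stats.update(chunks=n_chunks, bytes=n_bytes)


audio = b"".join(traced_chunks(iter([b"ab", b"cde"]), record))
```

The callback fires once, at the end of the stream, so per-chunk overhead is limited to two integer additions. Whether the span should be finalized in that callback or elsewhere is a design decision for the actual integration.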

Logging Audio Streams

  • Since the API returns an audio stream, it would be ideal to log this as an attachment, so users can play it back directly in the UI.
  • See how to do this in the docs here: Logging Attachments in Opik

Cost & Usage Tracking

  • The TTS models (tts-1, tts-1-hd) are billed by character count, not by token count.
  • We follow LiteLLM's pricing mappings: LiteLLM pricing source
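To make the character-based billing concrete, here is a small Python sketch of the arithmetic. The per-character prices below are assumptions for illustration only; real values should be read from the LiteLLM pricing source. The function name `estimate_tts_cost` is likewise hypothetical.

```python
from decimal import Decimal

# ASSUMPTION: illustrative per-character prices in USD. These are not
# taken from this thread; look up real values in the LiteLLM pricing map.
ASSUMED_PRICE_PER_CHARACTER = {
    "tts-1": Decimal("0.000015"),
    "tts-1-hd": Decimal("0.000030"),
}


def estimate_tts_cost(model: str, text: str) -> Decimal:
    """Estimate cost as (number of input characters) x (price per character)."""
    try:
        price = ASSUMED_PRICE_PER_CHARACTER[model]
    except KeyError:
        raise ValueError(f"no character price known for model {model!r}") from None
    return price * len(text)


# 1,000 characters at the assumed tts-1 price.
cost = estimate_tts_cost("tts-1", "a" * 1000)
```

`Decimal` is used instead of `float` so that very small per-character prices do not accumulate rounding error. Note that `len(text)` counts Unicode code points, which may not match exactly how the provider counts billable characters.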

To track cost and usage properly, here's what needs to happen:

SDK-Level (Python/TS) Work

  • Log the prompt character count under total_tokens in the trace metadata
  • Log the model name accurately (tts-1, tts-1-hd)
  • Keeping the provider name consistent as openai for TTS spans would be ideal
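The three SDK-level points above could be sketched roughly as follows. This is an assumption-laden illustration, not the Opik SDK's actual API. `build_tts_span_fields` and the dictionary layout are hypothetical; only the `total_tokens` key, the model names, and the `openai` provider value come from the list above.

```python
from typing import Any, Dict


def build_tts_span_fields(model: str, input_text: str) -> Dict[str, Any]:
    """Collect the fields a TTS span would need for cost tracking.

    The character count is stored under ``total_tokens`` because TTS
    models are billed per character rather than per token.
    """
    character_count = len(input_text)
    return {
        "model": model,        # e.g. "tts-1" or "tts-1-hd"
        "provider": "openai",  # kept consistent for all TTS spans
        "usage": {"total_tokens": character_count},
    }


fields = build_tts_span_fields("tts-1", "Hello")
```

Reusing `total_tokens` keeps the SDK payload shape unchanged, at the price of the field name no longer describing its contents. That trade-off is why the backend changes below need to know that the value is a character count for these models.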

Backend-Level Changes

  1. Support new billing metric (characters instead of tokens) in: CostService.java

  2. Implement cost logic specific to TTS in: SpanCostCalculator.java

Please let me know if you have more questions

Lothiraldan avatar May 28 '25 15:05 Lothiraldan

@b4s36t4 @ibishal let me know if either of you need help

vincentkoc avatar May 29 '25 11:05 vincentkoc

/attempt #2202

vishalpatil1899 avatar Jun 19 '25 21:06 vishalpatil1899

/attempt #2202

Gmin2 avatar Jun 25 '25 16:06 Gmin2

/attemp #2202

MAVRICK-1 avatar Jun 30 '25 09:06 MAVRICK-1

/attempt #2202

vladimirrotariu avatar Jul 25 '25 00:07 vladimirrotariu

/attempt #2202

Sahelisaha04 avatar Jul 28 '25 08:07 Sahelisaha04

If you have another open-source TTS model in mind, I could help, as I have experience with TTS models.

kb-assert-ai avatar Dec 05 '25 12:12 kb-assert-ai