Feature Request: Utterance timestamps in the ChatContext or Transcript

Open zaheerabbas-prodigal opened this issue 5 months ago • 0 comments

I want to capture word-level, or at least utterance-level, timestamps for both the user and agent transcripts and store them in the ChatContext. The use case is to treat the ChatContext as the call transcript and run post-processing on it, for example displaying calls in a UI, redaction, and summarization.

Currently, the SDK exposes only the text of the user transcript. Neither utterance-level nor word-level timestamps are exposed.

The agent's transcript has no timestamps either, even though the ElevenLabs and Cartesia TTS APIs support them.
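To make the request more concrete, here is a rough sketch of the kind of data I would like to end up with. Everything in it is hypothetical: `WordTiming`, `TimedUtterance`, `utterance_from_words`, and `redact` are names made up for illustration and are not part of the livekit-agents SDK, and I am assuming timestamps are float seconds measured from the start of the call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordTiming:
    # Hypothetical: one word with start/end offsets in seconds from call start.
    text: str
    start: float
    end: float


@dataclass
class TimedUtterance:
    # Hypothetical: one user or agent turn, with optional word-level detail.
    role: str  # "user" or "agent"
    text: str
    start: float
    end: float
    words: List[WordTiming] = field(default_factory=list)


def utterance_from_words(role: str, words: List[WordTiming]) -> TimedUtterance:
    """Build an utterance whose time span covers all of its words."""
    if not words:
        raise ValueError("need at least one word")
    return TimedUtterance(
        role=role,
        text=" ".join(w.text for w in words),
        start=min(w.start for w in words),
        end=max(w.end for w in words),
        words=list(words),
    )


def redact(utt: TimedUtterance, t0: float, t1: float, mask: str = "[REDACTED]") -> str:
    """Return the utterance text with every word overlapping [t0, t1) masked."""
    out = []
    for w in utt.words:
        overlaps = w.start < t1 and w.end > t0
        out.append(mask if overlaps else w.text)
    return " ".join(out)
```

With something like this attached to each chat item, post-processing such as time-based redaction or aligning the transcript with the call recording in a UI would only need the stored items, not the original STT/TTS streams.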

Has anyone found a way to get this timestamp data from the livekit-agents SDK?

I am happy to submit a PR that adds this feature. Would such a PR be merged?

zaheerabbas-prodigal, May 19 '25 08:05