Support for verbose_json for audio transcriptions

Open azzever opened this issue 10 months ago • 2 comments

What

This pull request addresses a bug in the current implementation of the verbose_json option for OpenAI's audio transcription API. It also extends the AudioTranscriptionResult format to support the additional parameters returned when verbose_json is enabled, such as word-level and segment-level timestamps.

Why

The verbose_json option in OpenAI's audio transcription API returns the timestamps needed to synchronize audio with its transcription. With the bug fixed and the AudioTranscriptionResult format extended, developers can read the timestamp data and the other returned parameters directly and use them to build synchronized audio-transcription experiences.

Affected Areas

audio/transcription:

  • The AudioTranscriptionResult format has been extended with optional values to accommodate the additional data provided by the verbose_json option.
  • A new timestampGranularities parameter has been added to AudioTranscriptionQuery.
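
To illustrate why the new values are optional, here is a minimal sketch of decoding both a plain and a verbose_json transcription response with a single type. The `VerboseTranscription`, `Word`, and `Segment` names are hypothetical and chosen for illustration only; they are not the actual AudioTranscriptionResult definition from this pull request. The JSON keys follow the fields OpenAI documents for verbose_json, and the sample payloads are made up.

```swift
import Foundation

// Hypothetical type for illustration; NOT the library's AudioTranscriptionResult.
struct VerboseTranscription: Codable {
    // Word-level timestamp entry (returned when the "word" granularity is requested).
    struct Word: Codable {
        let word: String
        let start: Double
        let end: Double
    }

    // Segment-level timestamp entry (only a subset of the documented fields).
    struct Segment: Codable {
        let id: Int
        let start: Double
        let end: Double
        let text: String
    }

    // Always present, regardless of response format.
    let text: String

    // Present only with verbose_json, so they are optional here.
    let language: String?
    let duration: Double?
    let segments: [Segment]?
    let words: [Word]?
}

// Made-up sample payloads: a plain json response and a verbose_json response.
let plainJSON = Data(#"{"text": "Hello world."}"#.utf8)
let verboseJSON = Data("""
{
  "task": "transcribe",
  "language": "english",
  "duration": 1.2,
  "text": "Hello world.",
  "segments": [
    {"id": 0, "start": 0.0, "end": 1.2, "text": "Hello world."}
  ],
  "words": [
    {"word": "Hello", "start": 0.0, "end": 0.5},
    {"word": "world", "start": 0.6, "end": 1.2}
  ]
}
""".utf8)

let decoder = JSONDecoder()
// Missing keys decode to nil because the properties are optional;
// unknown keys such as "task" are ignored by the synthesized decoder.
let plain = try! decoder.decode(VerboseTranscription.self, from: plainJSON)
let verbose = try! decoder.decode(VerboseTranscription.self, from: verboseJSON)

// Walk the word-level timestamps, if any were returned.
for word in verbose.words ?? [] {
    print("\(word.word): \(word.start)s to \(word.end)s")
}
```

Keeping the verbose-only fields optional lets one result type serve every response format, so existing callers that request plain json are unaffected.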

azzever, Apr 24 '24 06:04

Thank you for adding this, very helpful. Hopefully it gets merged soon.

joeldrotleff, May 24 '24 22:05