How to initiate realtime transcription session?

Open olarcher opened this issue 8 months ago • 2 comments

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • [x] This is an issue with the Python library

Describe the bug

Support for realtime audio transcriptions was recently announced: https://platform.openai.com/docs/guides/speech-to-text#streaming-the-transcription-of-an-ongoing-audio-recording

I noticed that in the latest release of the Python SDK, an AsyncTranscriptionSessions object has become available under client.beta.realtime.transcription_sessions. However, I am not sure how to use it to start a new realtime transcription session. Could someone please provide an example?

To Reproduce

Instantiate a new async client with:

from openai import AsyncOpenAI
client = AsyncOpenAI(api_key=OPENAI_API_KEY)

I am not sure what to do next to start a realtime transcription session.

Tried:

async with client.beta.realtime.connect(model="gpt-4o-realtime-preview", extra_query={"intent": "transcription"}) as conn:
    await client.beta.realtime.transcription_sessions.create(
        input_audio_transcription={
            "model": "gpt-4o-transcribe",
            "language": "de"
        }
    )

    async for message in conn:
        print(message)

This returns the error: ErrorEvent(error=Error(message='You must not provide a model parameter for transcription sessions.', type='invalid_request_error', code='invalid_model', event_id=None, param=None), event_id='event_BEbGnb8W18CQ9cEZPdORK', type='error')

Code snippets


OS

macOS

Python version

3.12

Library version

openai==1.68.2

olarcher avatar Mar 24 '25 12:03 olarcher

Same issue here. Have you resolved it?

gagan2209 avatar Mar 26 '25 10:03 gagan2209

Not yet. I'm falling back to a raw websocket connection for now...

olarcher avatar Mar 26 '25 14:03 olarcher

I ran into the same issue!

ANYMS-A avatar Jul 06 '25 12:07 ANYMS-A

This is a question for the underlying OpenAI API and not the SDK, so I'm going to go ahead and close this issue.

Would you mind reposting at community.openai.com?

RobertCraigie avatar Jul 07 '25 12:07 RobertCraigie

@RobertCraigie the docs I linked show how this can be achieved with the OpenAI (websocket realtime) API. Basically, you have to add an intent query param as follows: wss://api.openai.com/v1/realtime?intent=transcription

My question is how the same can be achieved with the Python SDK. Could you reopen please?
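
For anyone following along, here is a minimal sketch of the connection details the raw websocket approach needs. It only builds the URL and headers and opens no connection. The URL and the intent query param come from the linked docs. The helper names (transcription_url, auth_headers) are hypothetical, and the OpenAI-Beta header value is an assumption based on the beta realtime API, so please check it against the current docs. Note that no model param is added, which matches the error message above.

```python
from urllib.parse import urlencode

# Base URL taken from the linked docs.
REALTIME_BASE_URL = "wss://api.openai.com/v1/realtime"


def transcription_url(base_url: str = REALTIME_BASE_URL) -> str:
    """Build the realtime URL for a transcription session.

    Only the intent query param is set; no model param is sent,
    since the API rejects it for transcription sessions.
    """
    return f"{base_url}?{urlencode({'intent': 'transcription'})}"


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the handshake headers (hypothetical helper).

    Assumption: the beta realtime API expects a bearer token and
    the OpenAI-Beta header shown here.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "OpenAI-Beta": "realtime=v1",
    }


if __name__ == "__main__":
    # Placeholder key for illustration only.
    print(transcription_url())
    print(sorted(auth_headers("sk-placeholder")))
```

The resulting URL and headers can be passed to whichever websocket client you already use. What I am still missing is a way to get the SDK's connect() helper to produce the same request.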

olarcher avatar Jul 07 '25 13:07 olarcher