deepgram-python-sdk
UtteranceEnd never triggers
What is the current behavior?
The UtteranceEnd event does not fire 1000 ms after the last spoken word.
Steps to reproduce
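import base64

from deepgram import (
    DeepgramClient,
    DeepgramClientOptions,
    LiveOptions,
    LiveTranscriptionEvents,
)

# Finalized transcript segments, collected until speech_final arrives
is_finals = []

options: LiveOptions = LiveOptions(
    model="nova-2",
    language="en",
    # Apply smart formatting to the output
    smart_format=True,
    # Raw audio format details
    encoding="mulaw",
    channels=1,
    sample_rate=16000,
    # To get UtteranceEnd, the following must be set:
    interim_results=True,
    utterance_end_ms="1000",
    vad_events=True,
    # Time in milliseconds of silence to wait for before finalizing speech
    endpointing=300,
)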
@sock.route('/echo')
def echo(ws):
    try:
        # STEP 1: Create a Deepgram client using the API key
        config = DeepgramClientOptions(
            options={"keepalive": "true"}  # Comment this out to see the effect of not using keepalive
        )
        deepgram = DeepgramClient("", config)

        # STEP 2: Create a websocket connection to Deepgram
        dg_connection = deepgram.listen.live.v("1")

        # STEP 3: Define the event handlers for the connection
        def on_message(self, result, **kwargs):
            global is_finals
            print(result.type)
            if result.is_final:
                sentence = result.channel.alternatives[0].transcript
                is_finals.append(sentence)
                if result.speech_final:
                    utterance = ' '.join(is_finals)
                    print(f"Speech final: {utterance}")
                    is_finals = []

        def on_metadata(self, metadata, **kwargs):
            print(f"\n\n{metadata}\n\n")

        def on_error(self, error, **kwargs):
            print(f"\n\n{error}\n\n")

        dg_connection.on(LiveTranscriptionEvents.Transcript, on_message)
        dg_connection.on(LiveTranscriptionEvents.Metadata, on_metadata)
        dg_connection.on(LiveTranscriptionEvents.Error, on_error)

        dg_connection.start(options)

        while True:
            data = ws.receive()
            if data:
                dg_connection.send(base64.b64decode(data))
                # ws.send(data)
    except Exception as e:
        print(f"Error: {e}")
        # dg_connection.stop()
        # ws.close()
Play music in the background and speak. print(result.type) only ever prints Results; it never prints UtteranceEnd after I finish speaking. I have to stop the music for speech_final to be triggered.
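One possible explanation, offered as an assumption rather than a confirmed diagnosis: the snippet registers handlers only for Transcript, Metadata and Error. If the SDK routes each incoming message to handlers by the message's type field, then print(result.type) inside the Transcript handler can only ever show Results, whether or not an UtteranceEnd message was sent. The sketch below is a standalone toy dispatcher, not the SDK's code; the Dispatcher class and the sample payloads are hypothetical and exist only to illustrate that routing behavior.

```python
import json
from collections import defaultdict
from typing import Callable, Dict, List


class Dispatcher:
    """Toy stand-in for an event emitter that routes messages by their "type" field."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def on(self, event_type: str, handler: Callable[[dict], None]) -> None:
        """Register a handler for one message type."""
        self._handlers[event_type].append(handler)

    def feed(self, raw: str) -> bool:
        """Parse one JSON message; return True if at least one handler ran."""
        message = json.loads(raw)
        handlers = self._handlers.get(message.get("type"), [])
        for handler in handlers:
            handler(message)
        return bool(handlers)


seen: List[str] = []
dispatcher = Dispatcher()

# Only a "Results" handler is registered, mirroring the snippet in this report.
dispatcher.on("Results", lambda message: seen.append(message["type"]))

# Hypothetical payloads, for illustration only.
messages = [
    '{"type": "Results", "is_final": true}',
    '{"type": "UtteranceEnd", "last_word_end": 2.4}',
]
handled = [dispatcher.feed(raw) for raw in messages]

# The UtteranceEnd message arrives, but no handler is registered for it,
# so the "Results" handler never sees it.
print(seen)
print(handled)
```

If this assumption holds for the SDK version in use, registering a separate handler for the UtteranceEnd event would be the first thing to try before concluding that the event is never sent.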
Expected behavior
I would expect UtteranceEnd to trigger a second after my last word so I can finalize the sentence.
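To make the expected behavior concrete, here is a minimal sketch of finalizing a sentence on either signal, whichever arrives first. UtteranceBuffer is a hypothetical helper and not part of the SDK. In the real application, add_result would be called from the Transcript handler and on_utterance_end from a handler registered for the UtteranceEnd event, assuming the SDK version in use exposes one.

```python
from typing import List, Optional


class UtteranceBuffer:
    """Collects finalized transcript segments and joins them when the utterance ends."""

    def __init__(self) -> None:
        self._finals: List[str] = []

    def add_result(self, transcript: str, is_final: bool, speech_final: bool) -> Optional[str]:
        """Record one transcript result; return the full utterance on speech_final."""
        # Interim (non-final) results and empty transcripts are ignored.
        if is_final and transcript:
            self._finals.append(transcript)
        if speech_final:
            return self.flush()
        return None

    def on_utterance_end(self) -> Optional[str]:
        """Fallback path: finalize when an UtteranceEnd event arrives."""
        return self.flush()

    def flush(self) -> Optional[str]:
        """Join and clear the buffered segments; return None if nothing is buffered."""
        if not self._finals:
            return None
        utterance = " ".join(self._finals)
        self._finals = []
        return utterance


buffer = UtteranceBuffer()
buffer.add_result("hello", is_final=True, speech_final=False)
buffer.add_result("hello wor", is_final=False, speech_final=False)  # interim, ignored
buffer.add_result("world", is_final=True, speech_final=False)

# With background music, speech_final may never arrive; UtteranceEnd finalizes instead.
utterance = buffer.on_utterance_end()
print(utterance)
```

Because flush clears the buffer, a speech_final that arrives after UtteranceEnd (or the reverse) returns None instead of emitting the same sentence twice.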
Please tell us about your environment
Local Flask server on an M2 Mac.