`TypeError` When Using OpenAI STT: Concatenating `NoneType` in `_on_final_transcript` Method

Open LaiLaK918 opened this issue 1 year ago • 0 comments

Description

I'm encountering a TypeError in VoicePipelineAgent when OpenAI's STT service delivers the final transcript. Judging by the traceback, _on_final_transcript concatenates a None value onto the _transcribed_text string, which raises the error.

Error Details

DEBUG livekit.agents - http_session(): creating a new httpclient ctx {"pid": 1897483, "job_id": "AJ_42T5zxW7EuMp"}
FATAL: exception not rethrown
ERROR livekit.agents - job process exited with non-zero exit code -6 {"pid": 1897483, "job_id": "AJ_42T5zxW7EuMp"}
ERROR livekit.agents.pipeline - Error in _recognize_task
Traceback (most recent call last):
  File "path/to/livekit/agents/utils/log.py", line 16, in async_fn_logs
    return await fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path/to/livekit/agents/pipeline/human_input.py", line 152, in _recognize_task
    await asyncio.gather(*tasks)
  File "path/to/livekit/agents/pipeline/human_input.py", line 142, in _stt_stream_co
    self.emit("final_transcript", ev)
  File "path/to/livekit/agents/utils/event_emitter.py", line 14, in emit
    callback(*args, **kwargs)
  File "path/to/livekit/agents/pipeline/pipeline_agent.py", line 434, in _on_final_transcript
    self._transcribed_text += (
                              ^
TypeError: can only concatenate str (not "NoneType") to str {"pid": 1897829, "job_id": "AJ_fMuwwQeWe3Sw"}
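
To make the failure mode concrete, here is a minimal sketch that reproduces the same TypeError outside of LiveKit and shows one possible guard. It assumes the value being appended is the transcript text and that it can arrive as None. FakeAlternative, FakeAgent and both method names are hypothetical stand-ins for illustration only; this is not the actual code at pipeline_agent.py line 434, which I have not verified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeAlternative:
    """Hypothetical stand-in for an STT result; not a LiveKit class."""
    text: Optional[str]


class FakeAgent:
    """Hypothetical stand-in that only mimics the accumulation step."""

    def __init__(self) -> None:
        self._transcribed_text = ""

    def on_final_transcript_unguarded(self, alt: FakeAlternative) -> None:
        # Same shape as the failing line: str += a value that may be None.
        self._transcribed_text += alt.text

    def on_final_transcript_guarded(self, alt: FakeAlternative) -> None:
        # Assumed fix: treat a missing transcript as an empty string.
        text = alt.text or ""
        if text:
            separator = " " if self._transcribed_text else ""
            self._transcribed_text += separator + text


agent = FakeAgent()

# Appending None to a str raises the TypeError seen in the traceback.
try:
    agent.on_final_transcript_unguarded(FakeAlternative(text=None))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# With the guard, a None transcript is skipped and real text accumulates.
agent.on_final_transcript_guarded(FakeAlternative(text=None))
agent.on_final_transcript_guarded(FakeAlternative(text="hello"))
agent.on_final_transcript_guarded(FakeAlternative(text="world"))
print(repr(agent._transcribed_text))
```

If the assumption holds, a similar None check in the agent, or having the STT plugin always return a string, would presumably avoid the crash.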

Code Snippet

from livekit.agents import JobContext
from livekit.agents.voice_assistant import VoicePipelineAgent
from livekit.plugins import deepgram, openai, silero

async def entrypoint(ctx: JobContext):
    # ...
    assistant = VoicePipelineAgent(
        vad=ctx.proc.userdata["vad"],
        # stt=deepgram.STT(api_key=DEEPGRAM_API_KEY),
        stt=openai.STT(api_key=OPENAI_API_KEY),
        llm=openai.LLM(api_key=OPENAI_API_KEY, model=OPENAI_LLM_MODEL),
        tts=openai.TTS(api_key=OPENAI_API_KEY, model=OPENAI_TTS_MODEL),
        fnc_ctx=fnc_ctx,
        chat_ctx=initial_chat_ctx,
    )

Environment

Operating System: Ubuntu 22.04
Python Version: 3.11
LiveKit Versions:
- livekit==0.17.2
- livekit-agents==0.10.0
- livekit-api==0.7.1
- livekit-plugins-deepgram==0.6.7
- livekit-plugins-openai==0.10.1
- livekit-plugins-silero==0.7.1
- livekit-protocol==0.6.x

Steps to Reproduce

1. Set up the VoicePipelineAgent with the provided configuration, using OpenAI's STT.
2. Connect to the agent using the LiveKit Agent Playground.
3. Interact with the agent by speaking to trigger the speech-to-text (STT) process.
4. Monitor the logs for the TypeError when the final transcript is emitted.

Oct 07 '24 02:10 LaiLaK918