Audio Output Issues in Multimodal Live API

Open abdul7235 opened this issue 11 months ago • 0 comments

Description of the bug:

I am using the following code:

https://github.com/google-gemini/cookbook/blob/main/gemini-2/live_api_starter.py

This code was working fine 4 days ago, but today I’m encountering the following issues in the output:

  1. Words are frequently cut off or missing.
  2. Sentences and words are jumbled, making the audio difficult to understand.
  3. There is significant repetition of words and sentences.

I am attaching a recording from a session in which I asked Gemini for information about Manchester United. The link below points to the audio file captured from the streamed output.

https://drive.google.com/file/d/1ejDX9wlxt6wd0qM3-zDLTGh6g0GAAryv/view?usp=sharing
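
For anyone trying to reproduce this, a minimal sketch of one way to capture the streamed audio into a file for comparison is shown here. It assumes the response audio arrives as raw 16-bit mono PCM chunks at 24 kHz; those values are assumptions and should be checked against the constants in the script. The helper name `save_pcm_chunks` is hypothetical and is not part of the starter script or the API.

```python
import wave


def save_pcm_chunks(chunks, path, sample_rate=24000, channels=1, sample_width=2):
    """Write raw PCM chunks to a WAV file in arrival order.

    The defaults (24 kHz, mono, 16-bit samples) are assumptions about the
    streamed output format, not values confirmed by this report.
    Returns the number of audio frames written.
    """
    total_bytes = 0
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            # Chunks are appended exactly as received, so any cut-off,
            # reordered, or repeated audio is preserved in the file.
            wav_file.writeframes(chunk)
            total_bytes += len(chunk)
    return total_bytes // (channels * sample_width)
```

Saving the chunks unmodified, before they reach the playback queue, may help separate problems in the received stream from problems in local playback.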

Actual vs expected behavior:

Expected Behavior: The audio output should be smooth, clear, and easily understandable.

Actual Behavior: While the audio output was fine 4 days ago, the following discrepancies were observed today:

  1. Words are frequently cut off or missing.
  2. Sentences and words are jumbled.
  3. Significant repetition of words and sentences reduces clarity and usability.

Any other information you'd like to share?

No response

abdul7235, Dec 30 '24 17:12