Issue changing default voice in Gemini2 live api

Open notnotrishi opened this issue 1 year ago • 5 comments

Description of the bug:

I'm using the code in live_api_starter.py to test the multimodal live API and build from there. The basic code works. Based on the documentation at https://ai.google.dev/api/multimodal-live#sessions, I am trying to change the default voice, but it fails when I use the format given in the API documentation:

My code snippet:

MODEL = "models/gemini-2.0-flash-exp"

MODE = args.mode

client = genai.Client(http_options={"api_version": "v1alpha"})

CONFIG = {
    "generation_config": {
        "response_modalities": ["AUDIO"],
        "speech_config": {
            "voice_config": {
                "prebuilt_voice_config": {
                    "voice_name": "Charon"
                }
            }
        },
    }
}

Error message:

Traceback (most recent call last):
  File "/Users/rishi/Projects/gemini_live/new.py", line 393, in <module>
    asyncio.run(main.run())
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/rishi/Projects/gemini_live/new.py", line 362, in run
    async with (
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 204, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/live.py", line 626, in connect
    self._LiveSetup_to_mldev(model=transformed_model, config=config)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/live.py", line 459, in _LiveSetup_to_mldev
    _GenerateContentConfig_to_mldev(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/models.py", line 894, in _GenerateContentConfig_to_mldev
    t.t_speech_config(api_client, getv(from_object, ['speech_config'])),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/google/genai/_transformers.py", line 301, in t_speech_config
    raise ValueError(f'Unsupported speechConfig type: {type(origin)}')
ValueError: Unsupported speechConfig type: <class 'dict'>

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

notnotrishi avatar Dec 20 '24 01:12 notnotrishi

I had the same issue. Try this instead. It worked for me.

"generation_config": { "response_modalities": ["AUDIO"], "speech_config": "Charon" }
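A minimal sketch of the two config shapes discussed in this thread, assuming plain dicts are passed as in the snippets above. `build_generation_config` is a hypothetical helper for illustration, not part of the SDK; it only makes it easy to switch between the string shorthand and the nested form from the API documentation.

```python
def build_generation_config(voice_name, shorthand=True):
    """Return a config dict requesting audio output with the given voice.

    shorthand=True  -> speech_config is the bare voice name (the workaround above).
    shorthand=False -> speech_config is the nested form from the API documentation.
    NOTE: hypothetical helper, not an SDK function.
    """
    if shorthand:
        speech_config = voice_name
    else:
        speech_config = {
            "voice_config": {
                "prebuilt_voice_config": {"voice_name": voice_name}
            }
        }
    return {
        "generation_config": {
            "response_modalities": ["AUDIO"],
            "speech_config": speech_config,
        }
    }


# Shorthand form, as in the workaround.
CONFIG = build_generation_config("Charon")
```

Calling `build_generation_config("Charon", shorthand=False)` gives the nested form, which may be useful for retrying once the SDK accepts both.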

ThashilNaidoo avatar Dec 20 '24 16:12 ThashilNaidoo

that works, thanks @ThashilNaidoo !

but i also noticed the report is triaged as a bug so hopefully they can fix the issue and/or clarify which one is the correct usage

notnotrishi avatar Dec 20 '24 19:12 notnotrishi

Hi, thanks for reporting this. I just looked into it and this is fixed in the latest release (0.4).

Both methods work now.
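To see whether an environment already has that release, a rough check along these lines might help. It assumes the SDK is installed from the `google-genai` distribution (inferred from the `google/genai` path in the traceback, so treat the name as an assumption); `parse_version` and `has_fix` are hypothetical helpers.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.4.0' into (0, 4, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_fix(package="google-genai", minimum=(0, 4)):
    """Return True/False for an installed package, or None if it is not installed.

    ASSUMPTION: the package name and the (0, 4) threshold come from this thread,
    not from any official compatibility table.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return parse_version(installed) >= minimum


print(has_fix())
```

A result of `False` would suggest upgrading before retrying the nested `speech_config` form.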

MarkDaoust avatar Jan 09 '25 00:01 MarkDaoust

I tried both and neither is working for me

Invalid JSON payload received. Unknown name "prebuilt_voice_config " at 'setup.generati; then sent 1007 (invalid frame payload data) Request trace id: 74133fd7e2c07425, Invalid JSON payload received. Unknown name "prebuilt_voice_config " at 'setup.generati

conn = await es.enter_async_context(
    connect(
        f'wss://{HOST}/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateContent?key={API_KEY}'
    )
)
print('')

initial_request = {
    'setup': {
        'model': MODEL,
        'system_instruction': {
            "parts": [
                {
                    "text": SYSTEM_MESSAGE
                }
            ]
        },
        "tools": {'function_declarations': [pay_bill_tool, get_quote_tool]},
        "generation_config": {
            "response_modalities": ["AUDIO"],
            "speech_config": {
                "voice_config": {
                    "prebuilt_voice_config ": {
                        "voice_name": "Puck"
                    }
                }
            }
        }
    },
}
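
One detail worth checking: the unknown name quoted in the error, "prebuilt_voice_config ", ends with a space, and the key in the payload above is written the same way. That may be what the server is rejecting. A small sketch for spotting such keys before sending (`find_padded_keys` is a hypothetical helper, not an SDK function):

```python
def find_padded_keys(obj, path=""):
    """Yield dotted paths of dict keys that have leading or trailing whitespace."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            here = f"{path}.{key}" if path else str(key)
            if isinstance(key, str) and key != key.strip():
                yield here
            # Recurse into the value to catch padded keys at any depth.
            yield from find_padded_keys(value, here)
    elif isinstance(obj, (list, tuple)):
        for index, item in enumerate(obj):
            yield from find_padded_keys(item, f"{path}[{index}]")


# Same shape as the generation_config in the payload above,
# including the key with the trailing space.
generation_config = {
    "response_modalities": ["AUDIO"],
    "speech_config": {
        "voice_config": {
            "prebuilt_voice_config ": {"voice_name": "Puck"}
        }
    },
}

print(list(find_padded_keys(generation_config)))
```

If the list is non-empty, removing the stray whitespace from those keys and resending would be a reasonable first thing to try.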

brandonwheat avatar Jan 10 '25 16:01 brandonwheat

Let me try this

rubiagatra avatar Mar 18 '25 02:03 rubiagatra

Hi, I just checked on my end and it is working as expected. Please refer to the documentation, and let me know if you are still experiencing the issue.

Thanks

Gunand3043 avatar Jul 25 '25 07:07 Gunand3043

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

github-actions[bot] avatar Aug 08 '25 22:08 github-actions[bot]

This issue was closed because it has been inactive for 27 days. Please post a new issue if you need further assistance. Thanks!

github-actions[bot] avatar Aug 22 '25 22:08 github-actions[bot]