2024.5.0 Voice assistant, no speaker sounds
The problem
I am using the ESP32-S3-BOX (non 3) firmware from esphome/firmware. However, after updating to esphome 2024.5 I get no voice return. The text on the display does come up correctly and opening the audio link in a browser plays the audio as normal.
Which version of ESPHome has the issue?
2024.5.0
What type of installation are you using?
Home Assistant Add-on
Which version of Home Assistant has the issue?
2024.5
What platform are you using?
ESP32
Board
No response
Component causing the issue
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
[11:05:18][D][voice_assistant:591]: Speech recognised as: "Tell me a joke."
[11:05:18][D][text_sensor:064]: 'text_request': Sending state 'Tell me a joke.'
[11:05:18][W][component:237]: Component voice_assistant took a long time for an operation (240 ms).
[11:05:18][W][component:238]: Components should block for at most 30 ms.
[11:05:18][D][voice_assistant:563]: Event Type: 5
[11:05:18][D][voice_assistant:596]: Intent started
[11:05:19][D][voice_assistant:563]: Event Type: 6
[11:05:19][D][voice_assistant:563]: Event Type: 7
[11:05:19][D][voice_assistant:619]: Response: "I'm here to assist with your smart home. How can I help you today?"
[11:05:19][D][text_sensor:064]: 'text_response': Sending state 'I'm here to assist with your smart home. How can I help you today?'
[11:05:19][D][voice_assistant:563]: Event Type: 98
[11:05:19][D][voice_assistant:704]: TTS stream start
[11:05:19][D][esp-idf:000][speaker_task]: I (258604) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=8
[11:05:19][D][esp-idf:000][speaker_task]: I (258612) I2S: I2S0, MCLK output by GPIO2
[11:05:19][D][esp-idf:000][speaker_task]: I (258618) ESP32_S3_BOX: I2S0, MCLK output by GPIO0
[11:05:19][D][esp-idf:000][speaker_task]: I (258622) AUDIO_PIPELINE: link el->rb, el:0x3d85c2c8, tag:raw, rb:0x3d85c438
[11:05:19][D][esp-idf:000][speaker_task]: I (258629) AUDIO_ELEMENT: [raw-0x3d85c2c8] Element task created
[11:05:19][D][esp-idf:000][speaker_task]: I (258635) AUDIO_ELEMENT: [i2s-0x3d85c024] Element task created
[11:05:19][D][esp-idf:000][speaker_task]: I (258640) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8064151 Bytes, Inter:63740 Bytes, Dram:63740 Bytes
[11:05:19][D][esp-idf:000][i2s]: I (258646) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1
[11:05:19][D][esp-idf:000][i2s]: I (258648) I2S_STREAM: AUDIO_STREAM_WRITER
[11:05:19][D][esp-idf:000][speaker_task]: I (258652) AUDIO_PIPELINE: Pipeline started
[11:05:20][W][component:237]: Component voice_assistant took a long time for an operation (280 ms).
[11:05:20][W][component:238]: Components should block for at most 30 ms.
[11:05:20][D][voice_assistant:563]: Event Type: 8
[11:05:20][D][voice_assistant:639]: Response URL: "http://192.168.1.102:8123/api/tts_proxy/8e80ff9caa1ef21e0bcaaea38ac66211b3483bab_en-gb_2cdeae300d_tts.microsoft.wav"
[11:05:20][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[11:05:20][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[11:05:20][D][voice_assistant:563]: Event Type: 2
[11:05:20][D][voice_assistant:653]: Assist Pipeline ended
[11:05:21][D][esp-idf:000][speaker_task]: W (260212) AUDIO_PIPELINE: There are no listener registered
[11:05:21][D][esp-idf:000][speaker_task]: I (260219) AUDIO_PIPELINE: audio_pipeline_unlinked
[11:05:21][D][esp-idf:000][speaker_task]: W (260226) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:21][D][esp-idf:000][speaker_task]: I (260235) I2S: DMA queue destroyed
[11:05:21][D][esp-idf:000][speaker_task]: W (260243) AUDIO_ELEMENT: [filter] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:21][D][esp-idf:000][speaker_task]: W (260251) AUDIO_ELEMENT: [raw] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:21][D][esp-idf:000][speaker_task]: I (260291) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=8
[11:05:21][D][esp-idf:000][speaker_task]: I (260299) I2S: I2S0, MCLK output by GPIO2
[11:05:21][D][esp-idf:000][speaker_task]: I (260309) ESP32_S3_BOX: I2S0, MCLK output by GPIO0
[11:05:21][D][esp-idf:000][speaker_task]: I (260317) AUDIO_PIPELINE: link el->rb, el:0x3d85c2c8, tag:raw, rb:0x3d85c438
[11:05:21][D][esp-idf:000][speaker_task]: I (260325) AUDIO_ELEMENT: [raw-0x3d85c2c8] Element task created
[11:05:21][D][esp-idf:000][speaker_task]: I (260333) AUDIO_ELEMENT: [i2s-0x3d85c024] Element task created
[11:05:21][D][esp-idf:000][speaker_task]: I (260338) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8064243 Bytes, Inter:63832 Bytes, Dram:63832 Bytes
[11:05:21][D][esp-idf:000][i2s]: I (260345) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1
[11:05:21][D][esp-idf:000][i2s]: I (260348) I2S_STREAM: AUDIO_STREAM_WRITER
[11:05:21][D][esp-idf:000][speaker_task]: I (260350) AUDIO_PIPELINE: Pipeline started
[11:05:25][D][voice_assistant:563]: Event Type: 99
[11:05:25][D][voice_assistant:712]: TTS stream end
[11:05:25][D][voice_assistant:310]: End of audio stream received
[11:05:25][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[11:05:25][D][voice_assistant:445]: Desired state set to RESPONSE_FINISHED
[11:05:27][D][esp-idf:000][speaker_task]: W (266700) AUDIO_PIPELINE: There are no listener registered
[11:05:27][D][esp-idf:000][speaker_task]: I (266707) AUDIO_PIPELINE: audio_pipeline_unlinked
[11:05:27][D][esp-idf:000][speaker_task]: W (266716) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE
[11:05:27][D][esp-idf:000][speaker_task]: I (266723) I2S: DMA queue destroyed
Additional information
No response
Someone might want to merge all the related issues into one, or invent a search function ;-)
I have the same issue on a Box 3 device after updating to 2024.5.0. The wav file in the log has the right content, and the right content is shown on the display, but there is no audio except for popping sounds. How can I roll back to a previous firmware version to confirm this is the cause?
If you have a back up of the older ESPHome add-on you can roll it back and then reinstall the firmware on your Box. I unfortunately did not back up my ESPHome instance :cry:
Someone might want to merge all the related issues into one, or invent a search function ;-)
The issue is different though... Before there was actually audio, now there isn't.
I also have the same issue on a Box 3 device after updating to 2024.5.0.
I first thought this was an issue with music assistant, however I do get output, just very high-pitched, so maybe there is a sampling/bitrate issue at the root of this? (My hw = onju voice)
Also have the issue with the ESP32-S3-BOX-3 with the latest ESPHome 2024.5.0. I even completed a fresh install (base voice assistant install) on one of the ESP32-S3-BOX-3 devices and still did not have any audio.
Audio does work with the M5Stack Atom Echoes.
Hi, it doesn't work for me with an Atom Echo. After the voice response (no sound), the Atom Echo reboots. I downgraded to version 2024.4.2 and it works.
Where is the downgrade procedure documented? I see the same issue.
Assuming you are running it as a home-assistant addon, you should have a backup from before the upgrade (Home Assistant offers to make these by default), you have to restore that backup. If you have a full backup, you can do a partial restore and only restore the ESPHome addon.
I only just installed ESPHome because I just got an ESP32-S3-BOX-3, so the install put me straight on 2024.5.0 and I had no earlier version to go back to.
I solved it by going back using https://github.com/khenderick/esphome-legacy-addons/blob/main/README.md and now have audio responses working.
The tip at the end of this issue description fixed audio playback for me. Essentially, it pins the esp-idf version with version: 4.4.6, which implies that the esp-idf version change caused the regression.
confirming esp-idf version change as a condition to the bug. adding
esp32:
  framework:
    type: esp-idf
    version: 4.4.6
to the yaml resolves the issue.
For the new guys like me, which yaml file gets this?
The yaml you're uploading to the device, click Edit on the device in esphome.
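For a device adopted from the ready-made firmware, the device YAML is often just a short file that imports the project config as a package. A minimal sketch, assuming that layout: the device name and friendly name below are hypothetical, and the package line is the BOX-3 one quoted later in this thread (other boards use a different file). The esp32: block sits at the top level, next to packages:, and overrides the framework version that the package would otherwise select.

```yaml
substitutions:
  name: my-voice-box          # hypothetical device name
  friendly_name: Voice Box    # hypothetical friendly name

packages:
  # Package for the ESP32-S3-BOX-3; use the file that matches your board
  esphome.voice-assistant: github://esphome/firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml@main

esphome:
  name: ${name}
  friendly_name: ${friendly_name}

# Workaround reported in this thread: pin esp-idf instead of using the default
esp32:
  framework:
    type: esp-idf
    version: 4.4.6
```

After saving, choose Install so the firmware is recompiled with the pinned version.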
confirming also esp-idf version change as a condition to the bug. adding
esp32:
  framework:
    type: esp-idf
    version: 4.4.6
to the yaml resolves the issue.
I find the same after updating to esphome 2024.5.1 as I did with 2024.5.0.
version: recommended occasionally plays the first sound but then nothing more, version: 4.4.6 plays everything OK.
Just wanted to chime in that I'm also having the no-sound issue on both my S3-Box3 devices. They worked fine, albeit way too quiet, until the recent updates.
I experienced the same issue on a brand new S3 implementation after installing the ESPHome update: the chirp sound plays, but there is no text-to-speech response from HA. As suggested, I added this to the YAML in ESPHome, saved, and installed to the device, and I now have a voice response:
esp32:
  framework:
    type: esp-idf
    version: 4.4.6
Same issue on a Box 3 (ESP32-S3-BOX-3) with HA Core 2024.5.4 and ESPHome 2024.5.2: no audio output, text only appears partially, and listening takes too long.
also confirm this solution gets the voice working:
confirming esp-idf version change as a condition to the bug. adding
esp32:
  framework:
    type: esp-idf
    version: 4.4.6
to the yaml resolves the issue.
I just got the esp32-s3-box-3. After installing the ESPHome software it would respond to voice and do the command, but only text would appear. I did the rollback to esp32 framework version 4.4.6, and now it responds with voice. It does take a little time to respond, and I am using Whisper and Piper for audio.
Also sometimes it freezes after I give a command, it does the command, responds on screen, then freezes before any audio. It will stay frozen until unplugged or using the reboot button.
I've bought a brand new esp32-s3-box-3 to give voice assistant a try. Whatever I've tried I can't get audio to work. I just get a clicking sound where there should be audio.
I've tried downgrading esphome to 2024.4.2 and esp-idf to 4.4.6 but don't have any joy sadly. Tried all different combinations of just downgrading esphome, just downgrading esp-idf. Downgrading both etc.
The microphone however has seemed to work fine, regardless of version!
Not sure if there's a way I can test whether my speaker is faulty or not. But it seems lots of people have issues in general!
Is the factory demo firmware capable of playing audio for you?
Just tried flashing the factory demo and the speaker is working perfectly there. So definitely something with esphome!
Can you show the config you were flashing while specifying esp-idf to version 4.4.6?
I tried it both by copying the full yaml and as a package like this; neither works.
substitutions:
  name: esp32-s3-box-3-05a96c
  friendly_name: Office Assistant
packages:
  esphome.voice-assistant: github://esphome/firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml@main
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: ***
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: 192.168.68.26
    gateway: 192.168.68.1
    subnet: 255.255.255.0
esp32:
  framework:
    type: esp-idf
    version: 4.4.6
Interestingly, when I flashed from the ESPHome Projects site the sound worked!
But adopting the device into ESPHome forces a recompile, and at that point it broke. So whatever version has been built on the projects site works.
EDIT: I tried the above yaml again and it now works. Maybe I screwed something up before? Anyway all good now on the latest esphome and esp-idf 4.4.6
Downgrading to 4.4.6 fixed it for me as well. It would be good to get this updated to support 5.2.x (5.2.2 is current).
Upgrading to ESPHome 2024.5.5 fixed the issue for me (before upgrading I removed the config portion that specified the esp-idf version).
Can also confirm that ESPHome 2024.5.5 fixes the issue! All working perfectly now :)
Do you still have to use esp32 framework version 4.4.6 after updating ESPHome to 2024.5.5? I set mine back to recommended and I still have no sound.
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
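Whether the pin is still needed on 2024.5.5 is not settled in this thread; two commenters above report that it is not. If sound is still missing with recommended, one thing to try is the workaround reported earlier, applied to the same block. This is only a sketch for comparison: the single change is the version line, and everything else is copied from the config above.

```yaml
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    version: 4.4.6   # pinned, replacing "recommended"
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"
```

If audio returns with the pin and disappears again with recommended on the same ESPHome version, that comparison is worth reporting here.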