2024.5.0 Voice assistant, no speaker sounds

Open hugobloem opened this issue 1 year ago • 44 comments

The problem

I am using the ESP32-S3-BOX (not the BOX-3) firmware from esphome/firmware. However, after updating to ESPHome 2024.5 I get no voice response. The text on the display does come up correctly, and opening the audio link in a browser plays the audio as normal.

Which version of ESPHome has the issue?

2024.5.0

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

2024.5

What platform are you using?

ESP32

Board

No response

Component causing the issue

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

[11:05:18][D][voice_assistant:591]: Speech recognised as: "Tell me a joke."
[11:05:18][D][text_sensor:064]: 'text_request': Sending state 'Tell me a joke.'
[11:05:18][W][component:237]: Component voice_assistant took a long time for an operation (240 ms).
[11:05:18][W][component:238]: Components should block for at most 30 ms.
[11:05:18][D][voice_assistant:563]: Event Type: 5
[11:05:18][D][voice_assistant:596]: Intent started
[11:05:19][D][voice_assistant:563]: Event Type: 6
[11:05:19][D][voice_assistant:563]: Event Type: 7
[11:05:19][D][voice_assistant:619]: Response: "I'm here to assist with your smart home. How can I help you today?"
[11:05:19][D][text_sensor:064]: 'text_response': Sending state 'I'm here to assist with your smart home. How can I help you today?'
[11:05:19][D][voice_assistant:563]: Event Type: 98
[11:05:19][D][voice_assistant:704]: TTS stream start
[11:05:19][D][esp-idf:000][speaker_task]: I (258604) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=8

[11:05:19][D][esp-idf:000][speaker_task]: I (258612) I2S: I2S0, MCLK output by GPIO2

[11:05:19][D][esp-idf:000][speaker_task]: I (258618) ESP32_S3_BOX: I2S0, MCLK output by GPIO0

[11:05:19][D][esp-idf:000][speaker_task]: I (258622) AUDIO_PIPELINE: link el->rb, el:0x3d85c2c8, tag:raw, rb:0x3d85c438

[11:05:19][D][esp-idf:000][speaker_task]: I (258629) AUDIO_ELEMENT: [raw-0x3d85c2c8] Element task created

[11:05:19][D][esp-idf:000][speaker_task]: I (258635) AUDIO_ELEMENT: [i2s-0x3d85c024] Element task created

[11:05:19][D][esp-idf:000][speaker_task]: I (258640) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8064151 Bytes, Inter:63740 Bytes, Dram:63740 Bytes


[11:05:19][D][esp-idf:000][i2s]: I (258646) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1

[11:05:19][D][esp-idf:000][i2s]: I (258648) I2S_STREAM: AUDIO_STREAM_WRITER

[11:05:19][D][esp-idf:000][speaker_task]: I (258652) AUDIO_PIPELINE: Pipeline started

[11:05:20][W][component:237]: Component voice_assistant took a long time for an operation (280 ms).
[11:05:20][W][component:238]: Components should block for at most 30 ms.
[11:05:20][D][voice_assistant:563]: Event Type: 8
[11:05:20][D][voice_assistant:639]: Response URL: "http://192.168.1.102:8123/api/tts_proxy/8e80ff9caa1ef21e0bcaaea38ac66211b3483bab_en-gb_2cdeae300d_tts.microsoft.wav"
[11:05:20][D][voice_assistant:439]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[11:05:20][D][voice_assistant:445]: Desired state set to STREAMING_RESPONSE
[11:05:20][D][voice_assistant:563]: Event Type: 2
[11:05:20][D][voice_assistant:653]: Assist Pipeline ended
[11:05:21][D][esp-idf:000][speaker_task]: W (260212) AUDIO_PIPELINE: There are no listener registered

[11:05:21][D][esp-idf:000][speaker_task]: I (260219) AUDIO_PIPELINE: audio_pipeline_unlinked

[11:05:21][D][esp-idf:000][speaker_task]: W (260226) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE

[11:05:21][D][esp-idf:000][speaker_task]: I (260235) I2S: DMA queue destroyed

[11:05:21][D][esp-idf:000][speaker_task]: W (260243) AUDIO_ELEMENT: [filter] Element has not create when AUDIO_ELEMENT_TERMINATE

[11:05:21][D][esp-idf:000][speaker_task]: W (260251) AUDIO_ELEMENT: [raw] Element has not create when AUDIO_ELEMENT_TERMINATE

[11:05:21][D][esp-idf:000][speaker_task]: I (260291) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=8

[11:05:21][D][esp-idf:000][speaker_task]: I (260299) I2S: I2S0, MCLK output by GPIO2

[11:05:21][D][esp-idf:000][speaker_task]: I (260309) ESP32_S3_BOX: I2S0, MCLK output by GPIO0

[11:05:21][D][esp-idf:000][speaker_task]: I (260317) AUDIO_PIPELINE: link el->rb, el:0x3d85c2c8, tag:raw, rb:0x3d85c438

[11:05:21][D][esp-idf:000][speaker_task]: I (260325) AUDIO_ELEMENT: [raw-0x3d85c2c8] Element task created

[11:05:21][D][esp-idf:000][speaker_task]: I (260333) AUDIO_ELEMENT: [i2s-0x3d85c024] Element task created

[11:05:21][D][esp-idf:000][speaker_task]: I (260338) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8064243 Bytes, Inter:63832 Bytes, Dram:63832 Bytes


[11:05:21][D][esp-idf:000][i2s]: I (260345) AUDIO_ELEMENT: [i2s] AEL_MSG_CMD_RESUME,state:1

[11:05:21][D][esp-idf:000][i2s]: I (260348) I2S_STREAM: AUDIO_STREAM_WRITER

[11:05:21][D][esp-idf:000][speaker_task]: I (260350) AUDIO_PIPELINE: Pipeline started

[11:05:25][D][voice_assistant:563]: Event Type: 99
[11:05:25][D][voice_assistant:712]: TTS stream end
[11:05:25][D][voice_assistant:310]: End of audio stream received
[11:05:25][D][voice_assistant:439]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED
[11:05:25][D][voice_assistant:445]: Desired state set to RESPONSE_FINISHED
[11:05:27][D][esp-idf:000][speaker_task]: W (266700) AUDIO_PIPELINE: There are no listener registered

[11:05:27][D][esp-idf:000][speaker_task]: I (266707) AUDIO_PIPELINE: audio_pipeline_unlinked

[11:05:27][D][esp-idf:000][speaker_task]: W (266716) AUDIO_ELEMENT: [i2s] Element has not create when AUDIO_ELEMENT_TERMINATE

[11:05:27][D][esp-idf:000][speaker_task]: I (266723) I2S: DMA queue destroyed

Additional information

No response

hugobloem avatar May 15 '24 10:05 hugobloem

Someone might want to merge all the related issues into one, or invent a search function ;-)

Findarato avatar May 16 '24 13:05 Findarato

I have the same issue on a Box 3 device after updating to 2024.5.0. The WAV file linked in the log has the right content, and the right content is shown on the display, but there is no audio except for popping sounds. How can I roll back to a previous firmware version to confirm this is the cause?

ayavilevich avatar May 16 '24 13:05 ayavilevich

If you have a backup of the older ESPHome add-on, you can roll it back and then reinstall the firmware on your Box. I unfortunately did not back up my ESPHome instance :cry:

hugobloem avatar May 16 '24 14:05 hugobloem

Someone might want to merge all the related issues into one, or invent a search function ;-)

The issue is different though... Before there was actually audio, now there isn't.

hugobloem avatar May 16 '24 14:05 hugobloem

I also have the same issue on a Box 3 device after updating to 2024.5.0.

Pat0856 avatar May 16 '24 16:05 Pat0856

I first thought this was an issue with Music Assistant. However, I do get output, just very high-pitched, so maybe a sampling-rate/bitrate issue is at the root of this? (My hardware is an Onju Voice.)

tbrasser avatar May 16 '24 21:05 tbrasser

I also have the issue with the ESP32-S3-BOX-3 on the latest ESPHome, 2024.5.0. I even completed a fresh install (base voice assistant install) on one of the ESP32-S3-BOX-3 devices and still did not have any audio.

Audio does work with the M5Stack Atom Echoes.

smcnaught1 avatar May 17 '24 00:05 smcnaught1

Audio does work with the M5Stack Atom Echoes.

Hi, it doesn't work for me with an Atom Echo. After the voice response (no sound), the Atom Echo reboots. I downgraded to version 2024.4.2 and it works.

WarC0zes avatar May 17 '24 08:05 WarC0zes

Where is the downgrade procedure documented? I see the same issue.

almoney avatar May 17 '24 11:05 almoney

Where is the downgrade procedure documented? I see the same issue.

Assuming you are running it as a Home Assistant add-on, you should have a backup from before the upgrade (Home Assistant offers to make these by default), and you have to restore that backup. If you have a full backup, you can do a partial restore and only restore the ESPHome add-on.

cryptk avatar May 17 '24 15:05 cryptk

Where is the downgrade procedure documented? I see the same issue.

Assuming you are running it as a Home Assistant add-on, you should have a backup from before the upgrade (Home Assistant offers to make these by default), and you have to restore that backup. If you have a full backup, you can do a partial restore and only restore the ESPHome add-on.

I had only just installed ESPHome because I just got an ESP32-S3-BOX-3, so the install put me straight on 2024.5.0 and I had no backup to go back to.

I solved it by rolling back using https://github.com/khenderick/esphome-legacy-addons/blob/main/README.md and now have audio responses working.

almoney avatar May 17 '24 16:05 almoney

The tip at the end of this issue description fixed audio playback for me. Essentially, it pins the esp-idf version with version: 4.4.6, which implies that the esp-idf version change caused the regression.

rccoleman avatar May 18 '24 01:05 rccoleman

confirming esp-idf version change as a condition to the bug. adding

esp32:
  framework:
    type: esp-idf
    version: 4.4.6

to the yaml resolves the issue.

ayavilevich avatar May 18 '24 10:05 ayavilevich

confirming esp-idf version change as a condition to the bug. adding

esp32:
  framework:
    type: esp-idf
    version: 4.4.6

to the yaml resolves the issue.

For the new guys like me, which yaml file gets this?

almoney avatar May 18 '24 12:05 almoney

confirming esp-idf version change as a condition to the bug. adding

esp32:
  framework:
    type: esp-idf
    version: 4.4.6

to the yaml resolves the issue.

For the new guys like me, which yaml file gets this?

The YAML you're uploading to the device: click Edit on the device in ESPHome.
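
As a rough sketch of where the override goes, assuming a device config that pulls in the stock firmware as a package: the device name, friendly name, and package line below are placeholders for illustration, and your own file will differ. The esp32 block sits at the top level of the file, alongside the other top-level keys.

```yaml
# Device YAML opened via "Edit" in the ESPHome dashboard.
# All names below are hypothetical placeholders.
substitutions:
  name: my-s3-box
  friendly_name: My Box

packages:
  # Assumed: the stock voice-assistant firmware is pulled in as a package.
  esphome.voice-assistant: github://esphome/firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml@main

esphome:
  name: ${name}
  friendly_name: ${friendly_name}

# Top-level override: pin the esp-idf framework version
# that commenters in this thread report as working.
esp32:
  framework:
    type: esp-idf
    version: 4.4.6
```

After saving, the firmware has to be installed to the device again so that it is rebuilt against the pinned version.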

mad-tunes avatar May 18 '24 13:05 mad-tunes

confirming also esp-idf version change as a condition to the bug. adding

esp32:
  framework:
    type: esp-idf
    version: 4.4.6

to the yaml resolves the issue.

Pat0856 avatar May 18 '24 13:05 Pat0856

I see the same behaviour after updating to ESPHome 2024.5.1 as I did with 2024.5.0: with version: recommended the first sound occasionally plays but then nothing more, while version: 4.4.6 plays everything OK.

mad-tunes avatar May 20 '24 13:05 mad-tunes

Just wanted to chime in that I'm also having the no-sound issue on both my S3-Box3 devices. They worked fine, albeit way too quiet, until the recent updates.

cl0ud6uru avatar May 21 '24 18:05 cl0ud6uru

I experienced the same issue on a brand new S3 implementation after installing the ESPHome update: the chirp sound plays, but there is no text-to-speech response from HA. As suggested, I added this to the YAML in ESPHome, saved, and installed to the device, and I now have a voice response:

esp32:
  framework:
    type: esp-idf
    version: 4.4.6

hargcore avatar May 21 '24 18:05 hargcore

Same issue on a Box 3 (ESP32-S3-BOX-3) with HA Core 2024.5.4 and ESPHome 2024.5.2: no audio output, text only appears partially, and it takes too long listening.

I can also confirm this solution gets the voice working:

confirming esp-idf version change as a condition to the bug. adding

esp32:
  framework:
    type: esp-idf
    version: 4.4.6

to the yaml resolves the issue.

enoquefcd avatar May 22 '24 11:05 enoquefcd

I just got the esp32-s3-box-3. After installing the ESPHome software it would respond to voice and do the command, but only text would appear. I rolled back the esp32 framework to version 4.4.6, and now it responds with voice. It does take a little time to respond, and I am using Whisper and Piper for audio.

Also, it sometimes freezes after I give a command: it does the command, responds on screen, then freezes before any audio. It stays frozen until I unplug it or press the reboot button.

styphonthal avatar Jun 02 '24 16:06 styphonthal

I've bought a brand new esp32-s3-box-3 to give voice assistant a try. Whatever I've tried I can't get audio to work. I just get a clicking sound where there should be audio.

I've tried downgrading ESPHome to 2024.4.2 and esp-idf to 4.4.6 but sadly haven't had any joy. I tried all the different combinations: downgrading just ESPHome, downgrading just esp-idf, downgrading both, etc.

The microphone however has seemed to work fine, regardless of version!

Not sure if there's a way I can test whether my speaker is faulty or not. But it seems lots of people have issues in general!

adamlc avatar Jun 03 '24 14:06 adamlc

Is the factory demo firmware capable of playing audio for you?

tannisroot avatar Jun 03 '24 15:06 tannisroot

Just tried flashing the factory demo and the speaker is working perfectly there. So definitely something with esphome!

adamlc avatar Jun 03 '24 15:06 adamlc

Can you show the config you were flashing while specifying esp-idf to version 4.4.6?

tannisroot avatar Jun 03 '24 15:06 tannisroot

I tried it both by copying the full YAML and as a package like this; neither works.

substitutions:
  name: esp32-s3-box-3-05a96c
  friendly_name: Office Assistant
packages:
  esphome.voice-assistant: github://esphome/firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml@main
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: ***
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: 192.168.68.26
    gateway: 192.168.68.1
    subnet: 255.255.255.0
esp32:
  framework:
    type: esp-idf
    version: 4.4.6

Interestingly when I flashed from the ESPHome Projects the sound worked!

But adopting it into ESPHome forces a recompile, and at that point it broke. So whatever version has been built on the projects site works.

EDIT: I tried the above yaml again and it now works. Maybe I screwed something up before? Anyway all good now on the latest esphome and esp-idf 4.4.6

adamlc avatar Jun 04 '24 07:06 adamlc

Downgrading to 4.4.6 fixed it for me as well. It would be good to get this updated to support 5.2.x (5.2.2 is current).

sammcj avatar Jun 04 '24 22:06 sammcj

Upgrading to ESPHome 2024.5.5 fixed the issue for me (before upgrading I removed the config portion that specified the esp-idf version).
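
A sketch of what that change might look like, assuming the workaround had been added as a top-level esp32 block next to a package include as described earlier in the thread. The package line is a placeholder for whatever your config already uses. Commenting the block out (or deleting it) leaves the choice of framework version to the package and ESPHome again.

```yaml
# Assumed: ESPHome has already been upgraded to 2024.5.5.
packages:
  # Placeholder: keep whatever package line your config already has.
  esphome.voice-assistant: github://esphome/firmware/wake-word-voice-assistant/esp32-s3-box-3.yaml@main

# The earlier workaround, disabled so the pinned version no longer applies:
# esp32:
#   framework:
#     type: esp-idf
#     version: 4.4.6
```

The device needs to be reinstalled after the edit so the firmware is recompiled without the pin.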

tannisroot avatar Jun 05 '24 10:06 tannisroot

Can also confirm that ESPHome 2024.5.5 fixes the issue! All working perfectly now :)

adamlc avatar Jun 05 '24 10:06 adamlc

Do you still have to use esp32 framework version 4.4.6 after updating ESPHome to 2024.5.5? I set mine back to recommended and I still have no sound.

esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"

rechichidaniel avatar Jun 05 '24 11:06 rechichidaniel