ESP32 without PSRAM crashing after playing (SD) or silent (on 2nd streaming)

Open kaloprojects opened this issue 6 months ago • 31 comments

Hi @schreibfaul1 Wolle,

I found two issues in your current AUDIO.H, maybe a tiny detail/bug in the code but with huge impact ;). I am writing up both here as they seem to be related (they appear on non-PSRAM boards only). Background: I found them more or less by accident in my ESP32 chat project when I tested my current code, after some weeks, on a non-PSRAM ESP32 again. So the bug may have existed for longer and may not be caused by any update you made days or weeks ago (I don't know).

Fortunately I can reproduce the problems with your demo sketch too, which makes it much easier for you to reproduce them yourself :blush: In summary, 2 issues detected:

Issue 1: 'CRASH & reboot on connecttoFS()' – appears on an ESP32 without PSRAM (or alternatively on an ESP32 with PSRAM but PSRAM disabled):

This code line plays audio from SD successfully: audio.connecttoFS(SD, "/welcome.wav"); but when it has finished, the ESP ALWAYS crashes immediately with exactly these lines, after the audio played successfully on the speaker:

info        Closing audio file "welcome.wav"
CORRUPT HEAP: Bad head at 0x3ffefac4. Expected 0xabba1234 got 0x00000000
assert failed: multi_heap_free multi_heap_poisoning.c:279 (head != NULL)

Backtrace: 0x400828c4:0x3ffb1ee0 0x … then rebooting itself

Issue 2: 'ANY 2nd audio no longer playing' – also on an ESP32 without PSRAM (or an ESP32 with PSRAM but disabled):

// e.g. playing a Google audio event 1
audio.connecttospeech("Wenn die Hunde schlafen ist alles gut.", "de"); // Google TTS
while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done

// then playing a new event, e.g. Google again or Open AI, or radio streaming .. Result: no Audio any longer
audio.connecttohost("http://stream.antennethueringen.de/live/aac-64/stream.antennethueringen.de/"); // aac
while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // forever   

The 2nd audio is never played. The info log looks good until the very last line, then ends with:

...
info        The AACDecoder could not be initialized

It doesn't matter which audio type you play first (Google/OpenAI/Radio): the 2nd one will never be played. No crash, but the speaker stays silent.

IMPORTANT: Everything works well on an ESP32 with PSRAM (e.g. ESP32 Wrover DevKit); neither issue appears if PSRAM exists and is enabled.

I tested everything with multiple AUDIO.H versions (including your latest ESP32-audioI2S-3.2.0g, also 3.2.0 and earlier versions), and with the latest arduino-esp32 library (3.2.0, based on ESP-IDF v5.4.1).

Long story short: it looks to me as if some detail in audio.cpp is addressing PSRAM memory even when no PSRAM exists? ... and/or running out of heap after an audio event has played successfully?

Please let me know if you need any more details (or .log prints)! I am very happy to follow up with any testing on my side to support you as best I can!

Thank you in advance @schreibfaul1!! 👍 Hope it helps & that it is not too much work to fix :blush: (in case I missed anything or coded something wrong, let me know)

Below is the code I am using, so you can reproduce it. This happens:

  • ESP32 with PSRAM plays all 4 events perfectly, one by one.
  • Non-PSRAM ESP32 crashes after audio 1 has reached its end -> rebooting. If audio event 1 (SD) is removed, Google TTS plays once with no crash, but all audio after Google is 'silent' because of 'decoder not found'.
#include "Arduino.h"
#include "WiFi.h"
#include "Audio.h"
#include "SD.h"
#include "FS.h"

// Digital I/O used
#define SD_CS          5    
#define SPI_MOSI      23
#define SPI_MISO      19
#define SPI_SCK       18

#define I2S_DOUT      25    
#define I2S_BCLK      27
#define I2S_LRC       26

Audio audio;

String ssid =       "...";     
String password =   "...";
String OPENAI_KEY = "..."; 


void setup() {
    pinMode(SD_CS, OUTPUT);  digitalWrite(SD_CS, HIGH);
    SPI.begin(SPI_SCK, SPI_MISO, SPI_MOSI);
    Serial.begin(115200);
    SD.begin(SD_CS);
    WiFi.disconnect();
    WiFi.mode(WIFI_STA);
    WiFi.begin(ssid.c_str(), password.c_str());
    while (WiFi.status() != WL_CONNECTED) delay(1500);
    
    audio.setPinout(I2S_BCLK, I2S_LRC, I2S_DOUT);
    audio.setVolume(5); // default 0...21    
   
    // *** AUDIO Event 1 - local files ***  
    Serial.println ("***** SD starting now ... *****");
    audio.connecttoFS(SD, "/welcome.wav");     // SD
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done 
   
    // *** AUDIO Event 2 - Google TTS ***   
    Serial.println ("***** Google TTS starting now ... *****");
    audio.connecttospeech("Wenn die Hunde schlafen ist alles gut.", "de"); // Google TTS
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done
  
    // *** AUDIO Event 3 -  Open AI TTS ***    (just remove this event in case you don't have an API_KEY)
    // (details: 4th parameter/voice_instruct new since AUDIO.H v.3.1.0u)
    Serial.println ("***** Open AI starting now ... *****");
    audio.openai_speech(OPENAI_KEY, "tts-1", "Hallo, wie gehts ?", "", "onyx", "aac", "1");   
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done 
     
    // *** AUDIO Event 4 -  Radio streams ***  
    Serial.println ("***** Streaming starting now ... *****");
    audio.connecttohost("http://stream.antennethueringen.de/live/aac-64/stream.antennethueringen.de/"); // aac
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // forever        
}


void loop(){    
    audio.loop(); 
    vTaskDelay(1); 
}

Complete Output:

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0030,len:4888
load:0x40078000,len:16516
load:0x40080400,len:4
load:0x40080404,len:3476
entry 0x400805b4
info        audioI2S Version 3.2.0g
***** SD starting now ... *****
info        PSRAM not found, inputBufferSize: 13951 bytes
info        buffers freed, free Heap: 109416 bytes
info        Reading file: "/welcome.wav"
info        FormatCode: 1
info        DataRate: 96000
info        DataBlockSize: 4
info        BitsPerSample: 16
info        Audio-Length: 258516
info        stream ready
info        syncword found at pos 0
info        Channels: 2
info        SampleRate: 24000
info        BitsPerSample: 16
info        BitRate: 768000
info        Closing audio file "welcome.wav"
CORRUPT HEAP: Bad head at 0x3ffefac4. Expected 0xabba1234 got 0x00000000

assert failed: multi_heap_free multi_heap_poisoning.c:279 (head != NULL)

Backtrace: 0x400828bc:0x3ffb1ef0 0x4008dee1:0x3ffb1f10 0x40094085:0x3ffb1f30 0x40092d8b:0x3ffb2070 0x40083bcb:0x3ffb2090 0x400940cd:0x3ffb20b0 0x4008968b:0x3ffb20d0 0x400896ef:0x3ffb20f0 0x40102026:0x3ffb2110 0x40101e95:0x3ffb2130 0x400d7940:0x3ffb2160 0x400e01be:0x3ffb2190 0x400e438d:0x3ffb21b0 0x400d2de7:0x3ffb21d0 0x40105fcf:0x3ffb2270 0x4008eb36:0x3ffb2290

ELF file SHA256: d525551c2
Rebooting...

Settings of my ESP32 Wroom NodeMCU DevKit (no PSRAM):

Image

kaloprojects avatar May 23 '25 18:05 kaloprojects

Hi @schreibfaul1 Wolle,

I also saw a crash with Pink-Panther.wav. Maybe it is related:

I traced the crash to:

Audio.h line 117: size_t m_resBuffSizeRAM = 2048;

Audio.h line const size_t m_frameSizeWav = 4096;

When playing a WAV file, the reserve buffer is assumed to be 4096 bytes, but its real size is only 2048.

I think that the crash is caused by a buffer overflow.

Changing m_resBuffSizeRAM to 4096 works for me, but I don't know what other impact it has.
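The size mismatch described above can be illustrated outside the library. The following is a minimal sketch and not the library's code: the names `copyFrame` and `wavFrameFits` are hypothetical, and only the two sizes (2048 and 4096) are taken from this comment. It shows a guard that refuses a copy when the frame is larger than the reserve buffer, which is the condition suspected of corrupting the heap.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Sizes quoted from Audio.h in the comment above.
constexpr std::size_t kResBuffSizeRAM = 2048;  // m_resBuffSizeRAM
constexpr std::size_t kFrameSizeWav   = 4096;  // m_frameSizeWav

// Hypothetical helper (not part of the library): copy one frame into the
// reserve buffer only if it fits; otherwise report failure instead of
// writing past the end of the allocation.
bool copyFrame(std::vector<std::uint8_t>& reserve,
               const std::uint8_t* frame, std::size_t frameLen) {
    if (frameLen > reserve.size()) return false;  // would overflow the heap block
    std::memcpy(reserve.data(), frame, frameLen);
    return true;
}

// True when a WAV frame fits into a reserve buffer of the given size.
constexpr bool wavFrameFits(std::size_t reserveSize) {
    return kFrameSizeWav <= reserveSize;
}
```

With these numbers, `wavFrameFits(kResBuffSizeRAM)` is false and `wavFrameFits(4096)` is true, which matches the observation that raising the buffer to 4096 stops the crash.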

BTW - I am using ESP32 WROOM presently, so no PSRAM.

p-jean avatar May 23 '25 22:05 p-jean

Hello @kaloprojects , hello @p-jean, absolutely correct, size_t m_resBuffSizeRAM must be increased. Especially with larger projects, a lack of memory can quickly occur without PSRAM. I hope everything is still working.

schreibfaul1 avatar May 24 '25 09:05 schreibfaul1

Thank you both, @schreibfaul1 and @p-jean, for the help and fast response - great!

@schreibfaul1 thanks a lot for the fast code update. I spent some hours testing all scenarios again with your updated code ... tested with PSRAM (ESP32 Wrover DevKit) and without (ESP32 Wroom DevKit) -> good and less good news ;)

Issue 1 - crashing on 'audio.connecttoFS(SD..)' .. is solved 😊! No more crashes, audio from file works well .. with and without PSRAM, cool! 👍

Issue 2 - silent audio (no crashes but no audio playing) ... is NOT solved, and is even a bit worse (because now OpenAI no longer works, even with PSRAM). (Let's ignore the OpenAI issue for now; you can't test it anyhow because an API key is needed .. and I am pretty sure it will work as soon as we have fixed the radio streaming issue below.)

Here is my simplified code snippet, which I tested again today with the code update from @schreibfaul1 .. with and without PSRAM:

void setup() {
    ...
    // *** AUDIO Event 1 - local files ***  
    Serial.println ("***** SD starting now ... *****");
    audio.connecttoFS(SD, "/welcome.wav");     // SD
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done 
   
    // *** AUDIO Event 2 - Google TTS ***   
    Serial.println ("***** Google TTS starting now ... *****");
    audio.connecttospeech("Wenn die Hunde schlafen ist alles gut.", "de"); // Google TTS
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done 
  
    // *** AUDIO Event 3A -  Open AI TTS ***    
    // (details: 4th parameter/voice_instruct new since AUDIO.H v.3.1.0u)
    Serial.println ("***** Open AI starting now ... *****");
    audio.openai_speech(OPENAI_KEY, "tts-1", "Hallo, wie gehts ?", "", "onyx", "aac", "1");   
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done 

    // *** AUDIO Event 3B -  again Open AI TTS (just for testing) ***    
    // (details: 4th parameter/voice_instruct new since AUDIO.H v.3.1.0u)
    Serial.println ("***** Open AI starting now ... *****");
    audio.openai_speech(OPENAI_KEY, "tts-1", "Wie war deine Woche ?", "", "onyx", "aac", "1");   
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // waiting until done  
       
    // *** AUDIO Event 4 -  Radio streams ***  
    Serial.println ("***** Streaming starting now ... *****");
    audio.connecttohost("http://stream.antennethueringen.de/live/aac-64/stream.antennethueringen.de/"); // aac
    while (audio.isRunning()) { audio.loop(); vTaskDelay(1); }  // forever        
}

result on ESP32 without PSRAM:

info        audioI2S Version 3.2.0g
***** SD starting now ... *****
info        PSRAM not found, inputBufferSize: 11903 bytes
info        buffers freed, free Heap: 109416 bytes
info        Reading file: "/chimes.wav"
info        FormatCode: 1
info        DataRate: 176400
info        DataBlockSize: 4
info        BitsPerSample: 16
info        Audio-Length: 216276
info        stream ready
info        syncword found at pos 0
info        Channels: 2
info        SampleRate: 44100
info        BitsPerSample: 16
info        BitRate: 1411200
info        Closing audio file "chimes.wav"
info        End of file "chimes.wav"
***** Google TTS starting now ... *****
info        buffers freed, free Heap: 109276 bytes
info        connect to "translate.google.com.vn"
info        chunked data transfer
info        MP3Decoder has been initialized, free Heap: 78868 bytes , free stack 5580 DWORDs
info        file has no ID3 tag, skip metadata
info        Audio-Length: 23232
info        Webfile: stream ready, buffer filled in 12 ms
info        syncword found at pos 0
info        MPEG-1, Layer I
info        Channels: 1
info        SampleRate: 24000
info        BitsPerSample: 16
info        BitRate: 64000
info        End of speech "Wenn die Hunde schlafen ist alles gut."
***** Open AI starting now ... *****
info        buffers freed, free Heap: 107996 bytes
info        Connect to: "api.openai.com"
info        SSL has been established in 812 ms, free Heap: 65228 bytes
info        The AACDecoder could not be initialized
***** Open AI starting now ... *****
info        buffers freed, free Heap: 108252 bytes
info        Connect to: "api.openai.com"
info        SSL has been established in 780 ms, free Heap: 65028 bytes
info        The AACDecoder could not be initialized
***** Streaming starting now ... *****
info        buffers freed, free Heap: 107764 bytes
info        connect to: "stream.antennethueringen.de" on port 80 path "/live/aac-64/stream.antennethueringen.de/"
info        Connection has been established in 87 ms, free Heap: 106496 bytes
info        redirect to new host "https://frontend.streamonkey.net/antthue-mitte?aggregator=rewrite/stream.antennethueringen.de/"
info        buffers freed, free Heap: 107052 bytes
info        connect to: "frontend.streamonkey.net" on port 443 path "/antthue-mitte?aggregator=rewrite/stream.antennethueringen.de/"
info        SSL has been established in 500 ms, free Heap: 62468 bytes
info        redirect to new host "https://edge15.streamonkey.net/antthue-mitte?aggregator=rewrite/stream.antennethueringen.de/"
info        buffers freed, free Heap: 107380 bytes
info        connect to: "edge15.streamonkey.net" on port 443 path "/antthue-mitte?aggregator=rewrite/stream.antennethueringen.de/"
info        SSL has been established in 514 ms, free Heap: 62504 bytes
info        The AACDecoder could not be initialized

=> no more crashes (issue 1 solved 😊) .. SD card file + Google TTS work perfectly now (incl. audio), but there is no audio at all on any streaming. It still fails (same as before) with the final line: The AACDecoder could not be initialized

For your own tests (in case you don't have an OpenAI API key) ... just remove audio events 3A and 3B in my sketch above (audio event 4 still does not work in my testing). Could you @p-jean verify this once on your ESP32 Wroom (I use the same)?

So the question for now: why does audio.connecttohost("http://stream.antennethueringen.de/live/aac-64/stream.antennethueringen.de/"); fail on an ESP32 without PSRAM? .. hope you both have some idea ;)

Thx! .. let me know if you need any details on testing

kaloprojects avatar May 24 '25 17:05 kaloprojects

A lot has changed in the area of "chunked data transfer". Unfortunately, OpenAI speech stopped working after that. I have now built a server that simulates the OpenAI transmission and hope that it will work again.

Since last summer there has been a new AAC decoder: FAAD2 has replaced the old Helix decoder. FAAD2 requires more resources, so without PSRAM AAC will no longer work.

It is better not to use the ESP32 any more. With the current version you can use the ESP32-P4; WiFi and Ethernet run without problems on this board.

[Image: Waveshare ESP32-P4 Module Dev Kit]

schreibfaul1 avatar May 25 '25 18:05 schreibfaul1

Hi @schreibfaul1 ,

I was a bit 'shocked' when I read the post and just want to double-check that I got it right. FYI: jumping over to a P4 is not an option in my use case because many devices (with self-made printed circuits) are already built, so I need to use the existing hardware with an ESP32 Wroom, which just has an SD card reader (instead of PSRAM).

I got your point about the AAC decoder .. so I hope at least other formats are supported, e.g. mp3? FYI: I just tested this stream (one of my favorites, daily news), and it is also not working: audio.connecttohost("https://icecast.tagesschau.de/ndr/tagesschau24/live/mp3/128/stream.mp3")

Most important question ;) .. because the missing audio.openai_speech() would be a kind of knock-out for all my projects (for sure the main reason why I use and love your AUDIO.H 😊) .. what did you mean by "I have now built a server that simulates the OpenAI transmission and hope that it will work again."? Do you mean you built a server for your own tests ... or is anything needed from me or others?

FYI, I just checked: the OpenAI server can send audio in formats other than aac too. Supported formats for '.openai_speech()': aac | mp3 | wav (sample rate issue) | flac (PSRAM needed). I tried mp3 once, 5 minutes ago ... same message, decoder initialization fails:

***** Open AI starting now ... *****
info        buffers freed, free Heap: 108256 bytes
info        Connect to: "api.openai.com"
info        SSL has been established in 772 ms, free Heap: 65236 bytes
info        The MP3Decoder could not be initialized

I cross my fingers that you get the OpenAI TTS call running again on the existing ESP32 (without PSRAM) ✨!? .. my only fallback would be to use an older AUDIO.H version? (that would be a pain too, as I always appreciate all your cool new features/updates)

Thx again for all your work and effort in answering user posts 👍

kaloprojects avatar May 25 '25 19:05 kaloprojects

That's lack of memory, I guess. mbedtls for SSL needs about 40..50 KB. Maybe it works again if you take that back:

Image
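The 40..50 KB estimate can be checked against the "free Heap" values already printed in the logs of this thread. The sketch below is only arithmetic on those logged numbers; the helper name `heapUsed` and the constant names are hypothetical.

```cpp
#include <cstddef>

// Heap consumed between two "free Heap" log lines (0 if heap grew).
constexpr std::size_t heapUsed(std::size_t freeBefore, std::size_t freeAfter) {
    return freeBefore > freeAfter ? freeBefore - freeAfter : 0;
}

// Values copied from the logs above (bytes):
// "buffers freed, free Heap: 107996" -> "SSL has been established ..., free Heap: 65228"
constexpr std::size_t kSslCostOpenAi = heapUsed(107996, 65228);
// "buffers freed, free Heap: 107380" -> "SSL has been established ..., free Heap: 62504"
constexpr std::size_t kSslCostStream = heapUsed(107380, 62504);

// Both connections fall inside the 40..50 KB range mentioned above.
static_assert(kSslCostOpenAi >= 40 * 1024 && kSslCostOpenAi <= 50 * 1024,
              "OpenAI TLS session cost outside 40..50 KB");
static_assert(kSslCostStream >= 40 * 1024 && kSslCostStream <= 50 * 1024,
              "stream TLS session cost outside 40..50 KB");
```

So in these runs roughly 62 to 65 KB of heap remain once TLS is up, and the decoder has to be allocated from that remainder.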

schreibfaul1 avatar May 25 '25 19:05 schreibfaul1

Hi @schreibfaul1: I tested taking the update from @p-jean back .. I changed the 4096 back to the earlier 2048 for testing .. but then audio.connecttoFS(SD, "/welcome.wav"); plays the audio from SD and, once the file has reached its end, crashes and reboots (exactly issue 1 from my testing at the beginning 😓). So it looks like this change is still needed.

I have no good idea at the moment :( .. all I can do now is test all local copies I have of your AUDIO.H .. I have to find an older AUDIO.H version where at least this duo works: audio.openai_speech() + audio.connecttohost("https://...mp3") on my existing printed circuits, ESP32 Wroom & SD card (no PSRAM). FYI: this is also needed for techiesms' pcb .. he used my code 1:1 until today, but now my voice project code update no longer works for him (his pcb has no PSRAM, and I don't know which AUDIO.H version he should use for OpenAI).

  • SD card access, audio.connecttoFS(SD, ..);, without crashing would be great .. but is not mandatory (I could skip this if needed, because I need the SD card only for my own audio recording, not for playing)
  • audio.openai_speech(): most important. I can test with mp3 or aac, both would be OK. Sure, I love the latest version with @akdeb's voice instruction update .. and @schreibfaul1's great isRunning() fix from weeks ago (this works perfectly!). But better an old (working) OpenAI call that also works on a non-PSRAM ESP32 .. than a 'silent' one ;)
  • Streaming of mp3, audio.connecttohost("https://...mp3"): mp3 would be enough (no aac needed). Reason: with this function I can call the Speechgen IO server (as an add-on to OpenAI TTS), which also sends an mp3 URL. Btw: Google TTS is not needed, it is pretty useless for real human chat bots .. I just use it in my scripts for testing your AUDIO.H

Let me also ask @akdeb, as he made a great update with the voice instruction: Akash, do you have any idea which AUDIO.H version was the latest that worked well with audio.openai_speech() on an ESP32 without PSRAM? Background: it looks like the current audio.openai_speech() no longer works without PSRAM (which is a nightmare for me LOL .. as my printed circuits and also techiesms' pcb no longer work). Your Elato devices use an ESP32-S3, right?

So @schreibfaul1 .. let me dig deeper by testing several versions of your AUDIO.H. It might cost me 1-2 days (testing OpenAI + streaming with mp3/aac, with & without PSRAM, latest arduino-esp32 3.2.0) .. but I have no better idea at the moment to find the root cause (or at least the build date) of why, and since when, AUDIO.H no longer supports OpenAI TTS + mp3 streaming without PSRAM.

Let me know if this helps a bit .. or if there is anything particular I should test. Thank you!

kaloprojects avatar May 26 '25 14:05 kaloprojects

Hi @kaloprojects, there will probably only be a few bytes missing. Each Arduino version leaves a different amount of heap free. I have hardly changed anything recently that requires significant memory. The Arduino stack is set to 8k (DWORDS). Perhaps it can be reduced slightly in your project. A reduction of 1k brings 4 kB of additional heap: SET_LOOP_TASK_STACK_SIZE(7 * 1024);

Audio without PSRAM is hardly possible, especially with codecs like VORBIS and OPUS. OGG and FLAC can have frames up to a size of 64 KB.

In the next version, everything will be scaled to 48 kHz at the I2S output. This is necessary to be able to address Bluetooth devices directly, and some newer DACs also require it. And this is not possible without PSRAM.
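A minimal sketch of the suggestion above, under stated assumptions: the DWORD unit (4 bytes per stack unit) is taken from this comment and is not verified here, and `heapGainedBytes` is a hypothetical helper. The Arduino part only compiles on the target and uses `SET_LOOP_TASK_STACK_SIZE`, `ESP.getFreeHeap()` and `uxTaskGetStackHighWaterMark()` to print the real numbers, so the effect can be measured instead of guessed.

```cpp
#include <cstddef>

// Assumption taken from the comment above: stack sizes are counted in
// 32-bit words (DWORDs), so one unit is 4 bytes.
constexpr std::size_t kBytesPerDword = 4;

// Heap bytes gained by shrinking a stack from oldDwords to newDwords.
constexpr std::size_t heapGainedBytes(std::size_t oldDwords, std::size_t newDwords) {
    return oldDwords > newDwords ? (oldDwords - newDwords) * kBytesPerDword : 0;
}

#ifdef ARDUINO
#include "Arduino.h"

// Macro from arduino-esp32; it must appear at file scope, outside any function.
SET_LOOP_TASK_STACK_SIZE(7 * 1024);

void setup() {
    Serial.begin(115200);
    // Print measured values: free heap and the smallest amount of free
    // stack the loop task has had so far.
    Serial.printf("free heap: %u bytes\n", (unsigned)ESP.getFreeHeap());
    Serial.printf("loop stack high-water mark: %u\n",
                  (unsigned)uxTaskGetStackHighWaterMark(NULL));
}

void loop() {}
#endif
```

Comparing the printed free heap with and without the macro shows how much heap the reduction really frees on a given arduino-esp32 version, and the high-water mark shows how close the loop task is to running out of stack.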

schreibfaul1 avatar May 26 '25 19:05 schreibfaul1

Hi @schreibfaul1: I was just writing in parallel (sorry) ... I found it! :heart_eyes: :sparkling_heart: .. version 3.0.11g (July 18, 2024) is the last version which works perfectly on non-PSRAM (OpenAI, aac streaming etc.), great! .. the 3.0.12 published 5 days later fails (no OpenAI any longer, and aac streaming no longer works either).

So my workaround for all users for NOW would be (totally fine for me, I am happy with it): everybody with PSRAM can (and should) use your latest libraries, and to the non-PSRAM users (of my projects) I would recommend 3.0.11g (as I told them in the past anyhow). The only pain I have: ... there is no 3.0.11g published on your GitHub, right?? .. I can only find the 3.0.12 link (of 3.0.11g I only have a local zip backup).

I will post more testing details tomorrow! (happy for now that I found a workaround at least 👍 😊) and thx meanwhile for your post .. I have to jump out .. in a hurry, back tomorrow.

Thank you for the moment 👍 (I will test your idea as soon as I am back online tomorrow 🚀)

kaloprojects avatar May 26 '25 19:05 kaloprojects

The internal audio task was added exactly in July 2024. Its stack of 3300 DWORDS is in the heap and no longer in the Arduino stack. The Arduino stack can then be reduced by this amount.

schreibfaul1 avatar May 26 '25 19:05 schreibfaul1

Hi @Schreibfaul1, sorry for the late response, but I had to spend some hours over the last days testing your SET_LOOP_TASK_STACK_SIZE(7 * 1024) idea in detail. FYI (I don't want to bore you with details): I tested all AUDIO.H versions I have (and also tried 6K once instead of 7K) .. from 3.0.11g (with the important 8-bit wav fix) to 3.0.12, 3.1.0, ... 3.1.0w (with the nice isRunning fix and Akash's nice OpenAI voice instruction feature) .. up to your latest 3.2.0i.

Long story short: SET_LOOP_TASK_STACK_SIZE(7 * 1024) helped 'a bit' on the older 3.0.12 only .. all later versions still fail ('fail' in the sense that OpenAI is no longer speaking). And I do feel (I might be wrong) that this stack reduction has some side effects (e.g. on parallel WiFiClientSecure instances; some LLM connections seem less stable after this stack reduction).

So my final idea .. driven by the fact that future versions might need PSRAM anyhow .. AND the fact that I do not want to bore you any longer 😉 with this 'non-PSRAM' + OpenAI issue … we could finally think about this:

  • I will let my users know that PSRAM will be needed for any further updates
  • And to the existing user base (with non-PSRAM ESP32) I would recommend using the latest (perfectly working!) version 3.0.11g.

If you agree: could you maybe add 3.0.11g with a link to your version release page? I do feel this release might be very important for many of your AUDIO.H users, because it is the last version which works perfectly for non-PSRAM users (at least in all of my tests), right? 3.0.11g never (!) failed in any of my tests, no stack or other modification was needed at all, and OpenAI always worked perfectly. The published 3.0.12 is the first one which already fails on an ESP32 without PSRAM. (You could even rename 3.0.11g to 3.0.11 .. in case you hate the boring '..g' postfix ;). In case you lost this version yourself (smiley) .. here is your original .zip file (from July 18, 2024) on my private server (link removed)

Is this OK for you? .. (asking as I do not want to share your zip myself, it is your great work!). And for any further updates we all will use PSRAM anyhow (that's my takeaway, learned from you)

And a final word: THANK YOU so much for your great support, I really appreciate this, Wolle .. great work you are doing here day by day 👍

kaloprojects avatar May 28 '25 17:05 kaloprojects

Dear @kaloprojects, I have used your uploaded lib 3.0.11, but it seems that memory leaks when playing back mp3 from SPIFFS.

So could you help me check it again? Best regards,

Ref: https://github.com/schreibfaul1/ESP32-audioI2S/issues/1043#issuecomment-2914868655

nhoc20170861 avatar May 29 '25 12:05 nhoc20170861

@nhoc20170861, hi .. as mentioned, 3.0.11g (in my private link above) works perfectly with an SD card (I never tested SPIFFS because it is far too slow and critical for many I2S audio use cases .. so I used real SD cards only in the past, and PSRAM in future). ALL versions above 3.0.11g (also 3.0.12nn in my tests!) already fail on any OpenAI TTS call and on many mp3 & aac streaming services! Sorry to say this. Btw: you should try SET_LOOP_TASK_STACK_SIZE(7 * 1024) once with 3.0.12; with 3.0.11g it is not needed, with 3.0.12 it was needed in my tests. Maybe this solves your issue too.

But please do me a favor and do NOT share the 'private link version' further; it was more a help for @schreibfaul1 so he can first decide whether he adds it to his version history page (I added the link only in case he lost his 3.0.11g diamond ;). So please stay tuned until he has posted 3.0.11g officially (and we can follow up later on), is this OK? - Thanks ;)!

kaloprojects avatar May 29 '25 16:05 kaloprojects

Dear @kaloprojects, I tested with the lib 3.0.11g you attached. I tested with connecttoFS and connecttospeech.

I found it works fine but the sound is choppy. Then I tried release 3.0.12 (in fact 3.0.12e): it also has no memory leak (instead of playing the next audio file in the eof_audio_mp3 function, I checked for audio.isRunning() == false and then played the next track; incidentally, this does not leak memory) and the sound is better. Best regards
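The polling approach described above (start the next track from the main loop once `isRunning()` is false, rather than from inside the end-of-file callback) could look roughly like this. It is a sketch, not code from this thread: `advancePlaylist`, `FakePlayer` and the `start()` wrapper are hypothetical names, and on the ESP32 `start()` would be a thin wrapper around `audio.connecttoFS(...)`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library). Call it once per loop()
// pass: it starts the next track only after the player reports that the
// previous one has finished. Player must provide isRunning() and start(path).
template <typename Player>
bool advancePlaylist(Player& player, const std::vector<std::string>& tracks,
                     std::size_t& next) {
    if (player.isRunning()) return false;     // still playing, nothing to do
    if (next >= tracks.size()) return false;  // playlist finished
    player.start(tracks[next]);
    ++next;
    return true;
}

// Minimal stand-in so the logic can be exercised without hardware.
struct FakePlayer {
    bool running = false;
    std::string current;
    bool isRunning() const { return running; }
    void start(const std::string& path) { current = path; running = true; }
};
```

In a real sketch, `loop()` would keep calling `audio.loop()` and then `advancePlaylist(...)`, so the new connection is opened only after the library has completely finished the previous file.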

nhoc20170861 avatar May 30 '25 00:05 nhoc20170861

Hi, yes, the sound seems to get better from version to version, agreed, and I do feel it sounds best with PSRAM anyhow (btw: there are also some distortions at lower volumes below the max of 21). The problem is: with 3.0.12 and later (to be exact: since 3.0.11u in my tests) openai_speech() stopped working entirely on ESPs without PSRAM; that's why no OpenAI TTS projects can be built any longer (and my user folks depend on this unpublished 3.0.11 ;)

kaloprojects avatar May 30 '25 06:05 kaloprojects

sorry .. i closed thread for a moment by accident (LOL)

kaloprojects avatar May 30 '25 06:05 kaloprojects

Hello @kaloprojects, I saw the link above to your voice assistant and the circuit board created for it. That must have taken a lot of time and work. The ESP32 (without PSRAM) has been available to the masses for 7 years and replaced the 8266 at that time. I think it is time to use the ESP32-S3 N16R8 instead. This also gives you as a developer more freedom for stable software with an extended range of functions.

schreibfaul1 avatar May 30 '25 08:05 schreibfaul1

Hi @schreibfaul1 .. thx for the response 😊. Yes, you are totally right, that's what I have learned from YOU in the meantime .. for any further projects (including any upcoming circuit boards) I will go with PSRAM only! 👍

  • PSRAM seems to be state of the art by now,
  • your AUDIO.H library also works best with it,
  • sound quality is best,
  • it will continue to be supported by you (this counts!) ..
  • AND an SD card is no longer needed (which makes all of my further pcbs/printed circuits easier) 👍.

FYI: I will update my project code on the weekend anyhow. Lots of new AI features .. and most important: the project now supports PSRAM for the complete workflow (WAV audio recording > STT transcription > AI bot LLM / web search > OpenAI voices via your AUDIO.H etc.). Each of these steps is now possible with a PSRAM buffer. And I made my upcoming code backwards compatible (the user can decide via #define whether PSRAM or a pcb with SD card is used for the complete workflow 😊)

The reason why I was asking for the 3.0.11g link (in your history) is much simpler: the existing non-PSRAM user base needs 3.0.11g (for the currently published code and for my new upcoming code) .. and I can't find this version in your release history. So how can I best share it? (It is a bit funny if I tell them: 'use version 3.0.11g on existing pcbs .. but I don't know where to find it on schreibfaul1's GitHub' LOL ;). Do you have any idea how to share it with users? I am happy to follow your goals.
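A build-time switch like the one mentioned above might be sketched as follows. This is an illustration only: the macro name `USE_PSRAM_BUFFER`, the enum and `chooseBackend` are hypothetical and not taken from the project. The Arduino part uses `psramFound()` from arduino-esp32 so that a PSRAM build still falls back to the SD card on a board without PSRAM.

```cpp
// Hypothetical switch name; the actual project may use a different one.
// Comment this line out for boards that buffer on an SD card instead.
#define USE_PSRAM_BUFFER

enum class BufferBackend { Psram, SdCard };

// Pick the backend: the build-time choice, downgraded to the SD card when
// the board reports no PSRAM at runtime.
inline BufferBackend chooseBackend(bool psramAvailable) {
#ifdef USE_PSRAM_BUFFER
    return psramAvailable ? BufferBackend::Psram : BufferBackend::SdCard;
#else
    (void)psramAvailable;  // unused in the SD-only build
    return BufferBackend::SdCard;
#endif
}

#ifdef ARDUINO
#include "Arduino.h"

void setup() {
    Serial.begin(115200);
    BufferBackend backend = chooseBackend(psramFound());
    Serial.println(backend == BufferBackend::Psram ? "buffering in PSRAM"
                                                   : "buffering on SD card");
}

void loop() {}
#endif
```

Combining the `#define` with a runtime check keeps one firmware image usable on both board types, at the cost of compiling in both code paths.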

kaloprojects avatar May 30 '25 09:05 kaloprojects

The latest version with the old Helix AAC decoder is 3.0.12f, which is the one where aac does not yet require PSRAM.

In platformio.ini: lib_deps = https://github.com/schreibfaul1/ESP32-audioI2S.git#36cfa4c

or https://github.com/schreibfaul1/ESP32-audioI2S/commit/36cfa4caf4ef9465136609c9da439a612790429f

schreibfaul1 avatar May 30 '25 10:05 schreibfaul1

Thank you @schreibfaul1 .. happy to test (as an alternative to 3.0.11g) ... but excuse me if I ask a dumb question: where can I (or any other Arduino IDE user) download the complete 3.0.12f zip file?

https://github.com/schreibfaul1/ESP32-audioI2S#36cfa4c brings me to your homepage only (with the latest 3.2.0i, not 3.0.12f) .. and 36cfa4c shows me only the 2 changed files (.cpp/.h)

kaloprojects avatar May 30 '25 11:05 kaloprojects

No problem :-))

Image

Image

schreibfaul1 avatar May 30 '25 11:05 schreibfaul1

oohmggr .. i hate this Arduino IDE :weary: .. and i love your support 😍 ! I will test, thx 👍

kaloprojects avatar May 30 '25 12:05 kaloprojects

Great feature in GitHub .. to get from a commit ID (here https://github.com/schreibfaul1/ESP32-audioI2S/commit/36cfa4caf4ef9465136609c9da439a612790429f) to the complete library zip version, cool!

Now I am curious .. hopefully not boring you .. how can I find the last commit ID that belonged to 3.0.11g? Then I could create & upload the complete 3.0.11g zip too. Because this is the version I have been searching for without success for days (lol)

kaloprojects avatar May 30 '25 12:05 kaloprojects

Dear @schreibfaul1, I found that your commit https://github.com/schreibfaul1/ESP32-audioI2S/commit/36cfa4caf4ef9465136609c9da439a612790429f

uses FreeRTOS to handle audio, but the release v3.0.12 does not use it (although it has the tag 3.0.12e).

So could you tell me which version of the lib is better, without a memory leak (for ESP boards that don't have PSRAM)? Thank you very much

nhoc20170861 avatar May 30 '25 14:05 nhoc20170861

Hi @nhoc20170861, wouldn't it be better if you followed up on your memory leak issue in your other thread (memory leak)? I do not want to confuse @schreibfaul1 .. because this thread (which I opened) is related ONLY to the issue that OpenAI speech no longer works on non-PSRAM boards after 3.0.11g (nobody in my community has any memory leak issue, neither do I). I appreciate your input, don't get me wrong ;) .. but in this thread I honestly want to focus on the original subject only .. getting a zip link to the latest non-PSRAM version where openai_speech() works (meaning not 'crashing' / issue 1 and not 'silent' / issue 2) .. and this is 3.0.11g! (at least in all of my tests)

I do not want to waste schreibfaul1's time any longer with bug fixing or root cause research (well knowing and understanding that the ESP32 without PSRAM has no future at all) .. he has spent so much time on this already .. and I am more than thankful for his help 💫 👍 .. in the end I only asked him above how to build a 3.0.11g zip with his screenshot tricks ;)

kaloprojects avatar May 30 '25 15:05 kaloprojects

The latest version with the old Helix AAC decoder is 3.0.12f, which is the one where aac does not yet require PSRAM.

In platformio.ini: lib_deps = https://github.com/schreibfaul1/ESP32-audioI2S.git#36cfa4c

or 36cfa4c

Hi @schreibfaul1, i just wanted to close the issue .. but after detailed testing on my old ESP32 without PSRAM: Unfortunately your mentioned 3.0.12f is 'working' but not as reliable and stable as your 'diamond' 3.0.11g (in detail: Open AI TTS often fails to speak with 3.0.12f, in 3.0.11g speaking always works. Also some WiFiClientSecure connecting/memory issues with .12f). So i would avoid sharing the 12f with my user community (for the non-PSRAM users) .. i would still prefer to recommend them the stable 3.0.11g. So could you please tell me a link, with the same screenshot tricks as above, to get the 3.0.11g version? (the commit 36cfa4 happened later). Btw: the 3.0.11g included your great 8 bit wav fix (which was mandatory and works great with the latest 3.0.11g)

I just need a link to create a zip download to this version:

Image

.. or is it NO longer possible on your github site to create & download this zip? (I was hoping that i just need the right commit ID, earlier than 36cfa4.) Here again is a mirror on my own private server (a link i won't share around): (link removed), so you can see the complete zip i am trying to find a link for on your side

fingers crossed ;) thanks in advance for your answer (with a link hopefully ;)

Btw: best of all for ALL of your non-PSRAM users would be of course to find this version with a dedicated entry on your version release page. Because this 3.0.11g AUDIO.H version seems to be the last one which works flawlessly on non-PSRAM ESPs. But i totally understand if you do not want to add an oddly named version on the release page ;) .. that's why i was asking how i can get the link (to share further)

kaloprojects avatar Jun 02 '25 11:06 kaloprojects

Hi Kalo, maybe there was no version 3.0.11g, I did not find it in the version management.

Alternative: https://github.com/schreibfaul1/ESP32-audioI2S/tree/master/additional_info/Helix%20AAC%20Decoder/aac_decoder

You can replace the faad2 decoder with the Helix AAC decoder, replace everything that is in the aac-decoder folder. This is also possible with the version 3.2.1.

schreibfaul1 avatar Jun 02 '25 14:06 schreibfaul1

cool idea 👍 , Thx!. i will test immediately.

kaloprojects avatar Jun 02 '25 14:06 kaloprojects

Thank you @schreibfaul1 for all your effort helping us/me on this!

I tested your idea, used the latest 3.2.1 and replaced the faad2 decoder with the older Helix (using the 2 files from your link above). Result: Unfortunately not working, still a 'silent' Open AI (no audio, also no crashes/errors, but the well known info 'The AACDecoder could not be initialized').

So to make all our lives easier .. i just added a github sub folder library_archive in my Github Project and uploaded your always perfectly working 3.0.11g zip file there, so all users with an 'old ESP32 / no PSRAM' can use it. It is a 1:1 copy of your library at that time. Fyi, i made 2 modifications only:

  • updated the library.properties and library.json header
  • and removed the 4 largest audio files in ..\additional_info\Testfiles (this reduced the zip size from 29MB to 14MB, reason: github allows max 25MB for file uploads)

Fyi: i will let my folks know (when i upload my next version of ESP32-Voice-Assistant / supporting PSRAM now / will be published in a few days) that this 3.0.11g zip is intended for non-PSRAM users only; PSRAM users should always use your latest AUDIO.H from now on

So let me know (as it is YOUR great work :wink: 👍 ) if you are ok with me keeping a link/zip on my github .. if yes, let's finally close this issue #1039 with 'successfully solved' 😊

kaloprojects avatar Jun 03 '25 08:06 kaloprojects

Hi Wolle @schreibfaul1, just wanted to send you the final THANK YOU. Today i uploaded my latest ESP32-Voice-ChatGPT code, supporting newer ESP32 or ESP-S3 with PSRAM and the older ESP32 without PSRAM (and SD card instead).

So if you ever want to see how flawlessly your older 3.0.11g version of AUDIO.H works (on ESP32 without PSRAM) .. here is a video ;) .. video link. Workflow: I2S audio recording of a wav file into PSRAM -> SpeechToText -> Open AI LLM chat: all 3 steps coded myself (no AUDIO.H or other libs needed). Using your AUDIO.H for the final I2S audio output of Open AI TTS via audio.openai_speech() (also using your audio.connecttoFS(SD,..) for the tiny welcome 'gong' sound from SD card on power on).

And of course (as you mentioned) with PSRAM (and latest AUDIO.H) it works best anyhow 😊

So finally, thx for your earlier help! Closing this thread.

I have some further ideas/questions (playing audio wav from a PSRAM buffer .. instead of from SD card) .. but will open a new separate issue thread for this next.

kaloprojects avatar Jun 21 '25 15:06 kaloprojects