Crash when running Dia-1.6B
I installed the latest version this evening with pip3.11 install git+https://github.com/Blaizzy/mlx-audio.git, which was at commit https://github.com/Blaizzy/mlx-audio/commit/021586f92d30fbbf6e6b4fa27c83297837fbad4c
Running the example went well:
projects % python3.11 -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Dia is a new open weights model from Nari labs. [S2] Wow, crazy. (laughs) [S1] Yeah, it's now available on M L X as well." --sample_rate 44100 --play
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 16163.02it/s]
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 31300.78it/s]
Model: mlx-community/Dia-1.6B
Text: [S1] Dia is a new open weights model from Nari labs. [S2] Wow, crazy. (laughs) [S1] Yeah, it's now available on M L X as well.
Voice: None
Speed: 1.0x
Language: a
21%|██████▍ | 653/3072 [00:22<01:30, 26.79it/s]EOS detected at step 654 for channel 0
22%|██████▋ | 683/3072 [00:22<01:19, 30.23it/s]
==========
Duration: 00:00:07.571
Samples/sec: 13969.9
Prompt: 333899 tokens, 13969.9 tokens-per-sec
Audio: 333899 samples, 13969.9 samples-per-sec
Real-time factor: 0.32x
Processing time: 23.90s
Peak memory usage: 9.04GB
✅ Audio successfully generated and saving as: audio_000.wav
When I changed the input text and ran it again, the machine crashed partway through generation. The MacBook Pro screen and the external screen went off, while the keyboard light remained on. I held down the power button to reboot. Here is the terminal output up to the crash:
projects % python3.11 -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that" --sample_rate 44100 --play
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 21454.24it/s]
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 34521.02it/s]
Model: mlx-community/Dia-1.6B
Text: [S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that
Voice: None
Speed: 1.0x
Language: a
11%|███▏ | 324/3072 [00:10<01:26, 31.83it/s]
[Restored Apr 27, 2025 at 9:43:53 PM]
Last login: Sun Apr 27 21:43:50 on console
Restored session: Sun Apr 27 21:42:50 MDT 2025
This test was run on a MacBook Pro with an M4 Pro chip and 48GB of memory.
Thanks!
Hey @samreid
We patched all models to fix memory spikes in #164.
Please check it out and let us know if the issue persists.
Excellent, thanks, I think it is fixed. I ran
pip3.11 install git+https://github.com/Blaizzy/mlx-audio.git
when main was at 1eb879e7a6a47e327c5f03d04a923319f92ccd29 then
audio-project-6 % python3.11 -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that" --play
Fetching 2 files: 100%|████████████████████| 2/2 [00:00<00:00, 23301.69it/s]
Fetching 2 files: 100%|████████████████████| 2/2 [00:00<00:00, 32768.00it/s]
Model: mlx-community/Dia-1.6B
Text: [S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that
Voice: None
Speed: 1.0x
Language: a
54%|███████████████████▌ | 653/1200 [00:19<00:16, 33.46it/s]EOS detected at step 653 for channel 0
57%|████████████████████▍ | 682/1200 [00:19<00:14, 35.30it/s]
Starting audio stream...
✅ Audio successfully generated and saving as: audio_000.wav
It spoke the first two turns and wrote them to audio_000.wav ("[S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs)"), but it did not output the final [S1] turn ("Yeah, whatever..."). I do not know whether that is a known problem or by design (not going back and forth).
The crash is fixed though, thanks! Have a nice day.
I think that is because the text has to end with S2 and not S1.
Try this approach and let me know if the issue persists.
My pleasure! ❤️
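For anyone who wants to try the suggestion about ending on S2, here is a minimal sketch that reports which speaker tag a prompt ends with before it is passed to --text. The helper name last_speaker is hypothetical and not part of mlx-audio, and whether Dia really requires the final turn to be [S2] is only the guess made above, not something confirmed in this thread.

```python
import re
from typing import Optional

# Hypothetical helper, not part of mlx-audio.
# Matches the [S1] / [S2] speaker tags used in the prompts in this thread.
SPEAKER_TAG = re.compile(r"\[(S[12])\]")


def last_speaker(text: str) -> Optional[str]:
    """Return the last speaker tag in the prompt ("S1" or "S2"), or None if there is none."""
    tags = SPEAKER_TAG.findall(text)
    return tags[-1] if tags else None


prompt = (
    "[S1] Generate a novel haiku then speak it. "
    "[S2] Wow, no... (laughs) "
    "[S1] Yeah, whatever, I knew you would say that"
)

# Assumption from the reply above: the final turn may be dropped unless it is S2.
if last_speaker(prompt) != "S2":
    print("Prompt ends with", last_speaker(prompt), "- consider ending on an [S2] turn")
```

For the prompt from this thread the check reports S1, so reordering or adding a closing [S2] turn would be the thing to try.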