
Crash when running Dia-1.6B

Open samreid opened this issue 10 months ago • 4 comments

I installed the latest version this evening: pip3.11 install git+https://github.com/Blaizzy/mlx-audio.git at commit https://github.com/Blaizzy/mlx-audio/commit/021586f92d30fbbf6e6b4fa27c83297837fbad4c

Running the example went well:

projects % python3.11 -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Dia is a new open weights model from Nari labs. [S2] Wow, crazy. (laughs) [S1] Yeah, it's now available on M L X as well." --sample_rate 44100 --play
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 16163.02it/s]
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 31300.78it/s]

Model: mlx-community/Dia-1.6B
Text: [S1] Dia is a new open weights model from Nari labs. [S2] Wow, crazy. (laughs) [S1] Yeah, it's now available on M L X as well.
Voice: None
Speed: 1.0x
Language: a
 21%|██████▍                       | 653/3072 [00:22<01:30, 26.79it/s]EOS detected at step 654 for channel 0
 22%|██████▋                       | 683/3072 [00:22<01:19, 30.23it/s]
==========
Duration:              00:00:07.571
Samples/sec:           13969.9
Prompt:                333899 tokens, 13969.9 tokens-per-sec
Audio:                 333899 samples, 13969.9 samples-per-sec
Real-time factor:      0.32x
Processing time:       23.90s
Peak memory usage:     9.04GB
✅ Audio successfully generated and saving as: audio_000.wav
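As a sanity check on the summary above, the reported duration and real-time factor can be reproduced from the sample count, the `--sample_rate` value, and the processing time. This is a minimal sketch that assumes the real-time factor is audio duration divided by processing time; that definition matches the printed numbers, but it is an assumption rather than something confirmed from the mlx-audio source. The variable names are illustrative.

```python
# Values copied from the log above.
sample_rate_hz = 44_100       # from --sample_rate 44100
num_samples = 333_899         # "Audio: 333899 samples"
processing_time_s = 23.90     # "Processing time: 23.90s"

# Audio duration is the number of samples divided by the sample rate.
audio_duration_s = num_samples / sample_rate_hz

# Assumed definition: seconds of audio produced per second of processing.
real_time_factor = audio_duration_s / processing_time_s

print(f"Audio duration:   {audio_duration_s:.3f} s")
print(f"Real-time factor: {real_time_factor:.2f}x")
```

Under this definition, a value below 1.0 means generation took longer than the audio it produced.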

When I changed some of the input and ran again, it crashed partway through. The MacBook Pro screen and external screen went off. Keyboard light remained on. I held down the power button to reboot. Here is the trace:

projects % python3.11 -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that" --sample_rate 44100 --play
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 21454.24it/s]
Fetching 2 files: 100%|██████████████| 2/2 [00:00<00:00, 34521.02it/s]

Model: mlx-community/Dia-1.6B
Text: [S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that
Voice: None
Speed: 1.0x
Language: a
 11%|███▏                          | 324/3072 [00:10<01:26, 31.83it/s]
  [Restored Apr 27, 2025 at 9:43:53 PM]
Last login: Sun Apr 27 21:43:50 on console
Restored session: Sun Apr 27 21:42:50 MDT 2025

This test was run on a MacBook Pro with an M4 Pro chip and 48GB of memory.

Thanks!

samreid, Apr 28 '25 03:04

Hey @samreid

We patched all models to fix memory spikes in #164.

Please check it out and let us know if the issue persists.

Blaizzy, May 24 '25 11:05

Excellent, thanks, I think it is fixed. I ran

pip3.11 install git+https://github.com/Blaizzy/mlx-audio.git

when main was at 1eb879e7a6a47e327c5f03d04a923319f92ccd29, then ran:

audio-project-6 % python3.11 -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that" --play           
Fetching 2 files: 100%|████████████████████| 2/2 [00:00<00:00, 23301.69it/s]
Fetching 2 files: 100%|████████████████████| 2/2 [00:00<00:00, 32768.00it/s]

Model: mlx-community/Dia-1.6B
Text: [S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs) [S1] Yeah, whatever, I knew you would say that
Voice: None
Speed: 1.0x
Language: a
 54%|███████████████████▌                | 653/1200 [00:19<00:16, 33.46it/s]EOS detected at step 653 for channel 0
 57%|████████████████████▍               | 682/1200 [00:19<00:14, 35.30it/s]
Starting audio stream...
✅ Audio successfully generated and saving as: audio_000.wav

It spoke the first two lines and wrote them to audio_000.wav: "[S1] Generate a novel haiku then speak it. [S2] Wow, no... (laughs)". It did not output the second S1 line, "Yeah, whatever...". I do not know whether that is a known problem or by design (not going back and forth).

The crash is fixed though, thanks! Have a nice day.

samreid, May 24 '25 14:05

I think that's because the text has to end with S2 and not S1.

Try this approach and let me know if the issue persists.
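To check a prompt against this suggestion before generating, one option is to look at the last speaker tag in the text. The sketch below is only an illustration: `last_speaker` and `ends_with_s2` are hypothetical helpers, not part of mlx-audio, and the rule that the text must end on an S2 turn is the guess made above rather than documented behavior.

```python
import re
from typing import Optional

# Matches the speaker tags used in the prompts above: [S1] and [S2].
SPEAKER_TAG = re.compile(r"\[(S[12])\]")


def last_speaker(text: str) -> Optional[str]:
    """Return the last speaker tag in the text ("S1" or "S2"), or None if there is none."""
    tags = SPEAKER_TAG.findall(text)
    return tags[-1] if tags else None


def ends_with_s2(text: str) -> bool:
    """True if the final speaker turn in the text belongs to S2."""
    return last_speaker(text) == "S2"


# The prompt from the run above, which ends on an S1 turn.
prompt = (
    "[S1] Generate a novel haiku then speak it. "
    "[S2] Wow, no... (laughs) "
    "[S1] Yeah, whatever, I knew you would say that"
)

if not ends_with_s2(prompt):
    print(f"Prompt ends with {last_speaker(prompt)}; try ending on an [S2] turn.")
```

For the prompt above the check fails because the final tag is S1, so rewriting the dialogue to finish on an [S2] turn is the thing to try.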

Blaizzy, May 24 '25 15:05

My pleasure! ❤️

Blaizzy, May 24 '25 15:05