Prince Canuma comments

Results 572 comments of Prince Canuma

Add IndexTTS

Awesome, thanks for the fix! One more thing I just noticed. The converted model has the codec weights embedded within it. I typically keep codec weights separate from the main...

> I think [MLX only quantizes modules that implement to_quantized(group_size, bits)](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.quantize.html#mlx.core.quantize). However, the BigVGAN codec doesn't contain any Linear or Embedding layers, and [Conv1d layers don't have a to_quantized method](https://ml-explore.github.io/mlx/build/html/python/nn/_autosummary/mlx.nn.Conv1d.html), so in theory...
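
The point in the quoted comment can be illustrated with a small toy sketch. It assumes, as the comment suggests, that a layer is picked up for quantization only if it exposes a `to_quantized` method. The classes and layer names below are hypothetical stand-ins written for this illustration; they are not the real `mlx.nn` classes and not the actual IndexTTS or BigVGAN layer names.

```python
# Toy stand-ins, NOT the real mlx.nn classes: they only model whether a
# layer exposes a to_quantized(group_size, bits) method.
class Linear:
    def to_quantized(self, group_size=64, bits=4):
        return f"QuantizedLinear(group_size={group_size}, bits={bits})"


class Embedding:
    def to_quantized(self, group_size=64, bits=4):
        return f"QuantizedEmbedding(group_size={group_size}, bits={bits})"


class Conv1d:
    # Deliberately has no to_quantized method.
    pass


def quantizable_layers(named_layers):
    """Return the names of layers that expose a callable to_quantized."""
    return [
        name
        for name, layer in named_layers.items()
        if callable(getattr(layer, "to_quantized", None))
    ]


# Hypothetical layer names, chosen only for illustration.
codec = {"conv_pre": Conv1d(), "upsample_0": Conv1d(), "conv_post": Conv1d()}
tts = {"text_embedding": Embedding(), "lm_head": Linear(), "conv_in": Conv1d()}

print(quantizable_layers(codec))  # no layer qualifies
print(quantizable_layers(tts))    # only the Embedding and Linear entries qualify
```

Under this assumption, a codec built purely from convolution layers would pass through quantization unchanged, which is the scenario the comment describes.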

Add IndexTTS

> I just included the original wetext normalizer, but please let me know if you prefer otherwise!

Great, I like the normalizer! However, I wonder if it's possible to normalize...

Add IndexTTS

No worries. For now, @senstella, we can remove `wetext`, since as far as I can see we're not using it anywhere, and merge this PR. In the meantime, @yarshure and...

Add MKDocs for mlx-audio

Awesome, thanks @nlauchande!

Bug Report: Swift-TTS-iOS Crash Due to Excessive Memory Usage

Yeah, we recently fixed memory spikes in #165. Please give it a try and let me know if the issues continue. I'm not sure limiting memory usage is the best...

any examples how to use the spark model ? thanks for this amazing tool to use on mac

Voice cloning:

```bash
python -m mlx_audio.tts.generate --model Spark-TTS-0.5B-6bit --text " Get started today, P I P install M L X dash audio" --play --file_prefix spark --sample_rate 16000 --pitch 1.0 --speed...
```
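
For scripting the same call, a minimal sketch could assemble the argument list in Python. It uses only the flags visible in the command above; the `--speed` value and anything after it are cut off in the original, so they are left out here. The helper name `build_generate_command` is hypothetical, and the sketch only builds and prints the command without running it.

```python
import shlex


def build_generate_command(model, text, file_prefix, sample_rate, pitch, play=True):
    """Assemble the argv list for the CLI call shown in the comment above.

    Hypothetical helper: only flags visible in the original command are used.
    """
    cmd = ["python", "-m", "mlx_audio.tts.generate", "--model", model, "--text", text]
    if play:
        cmd.append("--play")
    cmd += [
        "--file_prefix", file_prefix,
        "--sample_rate", str(sample_rate),
        "--pitch", str(pitch),
    ]
    return cmd


cmd = build_generate_command(
    model="Spark-TTS-0.5B-6bit",
    text="Get started today, P I P install M L X dash audio",
    file_prefix="spark",
    sample_rate=16000,
    pitch=1.0,
)

# shlex.join quotes the text argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it, assuming the package is installed and the model is available.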

any examples how to use the spark model ? thanks for this amazing tool to use on mac

> [@Blaizzy](https://github.com/Blaizzy) When I use the command " python -m mlx_audio.tts.generate --model mlx-community/Spark-TTS-0.5B-fp16 --text "嗯，今天是个特别的日子，天气嘛，大概是23度左右，挺舒服的。Well… it's sunny and bright, perfect for a walk, don’t you think? 顺便来学一句日语吧：「おはようございます」，也就是“早上好”的意思。然后是韩语：「안녕하세요」，就是“你好”。Now, let’s do a...

KeyError: 'image_token_index' when training with LoRA on Qwen2.5-VL-3B-Instruct-bf16

Hey @mathav95raj, could you change the image token from the current one to the one Qwen 2.5 VL uses? You can find examples here: https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct Let me know if that fixes the issue....

KeyError: 'image_token_index' when training with LoRA on Qwen2.5-VL-3B-Instruct-bf16

Hey guys, we are refactoring the entire training pipeline. Check out #261 by @Goekdeniz-Guelmez.