Prince Canuma
Awesome, thanks for the fix! One more thing I just noticed. The converted model has the codec weights embedded within it. I typically keep codec weights separate from the main...
> I think [MLX only quantizes modules that implement to_quantized(group_size, bits)](https://ml-explore.github.io/mlx/build/html/python/_autosummary/mlx.core.quantize.html#mlx.core.quantize). However, the BigVGAN codec doesn't contain any Linear or Embedding layers, and [Conv1d layers don't have a to_quantized method](https://ml-explore.github.io/mlx/build/html/python/nn/_autosummary/mlx.nn.Conv1d.html), so in theory...
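
The selection rule described in the quote can be sketched roughly as follows. This is a minimal illustration using stand-in classes, not the real `mlx.nn` layers: `Linear`, `QuantizedLinear`, `Conv1d`, and `quantize_tree` are all hypothetical names defined here, and the only assumption carried over from the quote is that a module is quantized if and only if it exposes `to_quantized(group_size, bits)`.

```python
# Stand-in classes for illustration only; these are NOT the mlx.nn layers.
class QuantizedLinear:
    def __init__(self, group_size, bits):
        self.group_size = group_size
        self.bits = bits


class Linear:
    # Assumption from the quote: quantizable modules expose to_quantized().
    def to_quantized(self, group_size=64, bits=4):
        return QuantizedLinear(group_size, bits)


class Conv1d:
    # No to_quantized() method, so the rule below leaves it untouched.
    pass


def quantize_tree(modules, group_size=64, bits=4):
    """Swap in quantized versions only for modules that offer to_quantized()."""
    return {
        name: m.to_quantized(group_size, bits) if hasattr(m, "to_quantized") else m
        for name, m in modules.items()
    }


# A toy "model": one Linear layer plus a conv-only codec.
model = {"lm_head": Linear(), "codec.conv_pre": Conv1d(), "codec.conv_post": Conv1d()}
quantized = quantize_tree(model, group_size=64, bits=6)

for name, module in quantized.items():
    print(name, type(module).__name__)
```

Under this rule a codec built only from convolution layers passes through unchanged, which is the behavior the quote expects "in theory".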
> I just included the original wetext normalizer, but please let me know if you prefer otherwise!

Great, I like the normalizer! However, I wonder if it's possible to normalize...
No worries, for now @senstella we can remove `wetext` since we're not using it anywhere as far as I can see and merge this PR. In the meantime, @yarshure and...
Awesome, thanks @nlauchande!
Yeah, we recently fixed memory spikes in #165. Please give it a try and let me know if the issues continue. I'm not sure limiting memory usage is the best...
Voice cloning: ```bash python -m mlx_audio.tts.generate --model Spark-TTS-0.5B-6bit --text " Get started today, P I P install M L X dash audio" --play --file_prefix spark --sample_rate 16000 --pitch 1.0 --speed...
> [@Blaizzy](https://github.com/Blaizzy) When I use the command " python -m mlx_audio.tts.generate --model mlx-community/Spark-TTS-0.5B-fp16 --text "嗯,今天是个特别的日子,天气嘛,大概是23度左右,挺舒服的。Well… it's sunny and bright, perfect for a walk, don’t you think? 顺便来学一句日语吧:「おはようございます」,也就是“早上好”的意思。然后是韩语:「안녕하세요」,就是“你好”。Now, let’s do a...
Hey @mathav95raj, could you change the image token to the one Qwen 2.5 VL uses? You can find examples here: https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct Let me know if that fixes the issue....
Hey guys, we are refactoring the entire training pipeline. Check out #261 by @Goekdeniz-Guelmez.