Any hints on adding new language support?
Hi there, thanks for the impressive work, it works flawlessly in English!
I'm struggling to find information on how to add support for a new language; I dug through the original CSM repos and this one but found no clues. I'd like to add French support.
(I'm also a beginner in these things, so any tips are appreciated.)
EDIT: I just saw that you will enable full language support for Kokoro (which includes French). Still, the voice is not as good as CSM / Orpheus.
Aside: while trying to use Orpheus, I get the error ValueError: Model type llama not supported.
Just here to say that I'm facing the same issue:
Error loading model: Model type llama not supported.
When trying:
python -m mlx_audio.tts.generate --model mlx-community/orpheus-3b-0.1-ft-bf16 --text "Hello world" --voice tara --temperature 0.6 --audio_format mp3
https://huggingface.co/mlx-community/orpheus-3b-0.1-ft-bf16
Could you share the full traceback and which version of mlx-audio you are running?
Also, could you try installing from source and see if the issue persists?
@Blaizzy When running mlx-community/3b-ko-ft-research_release-6bit, the same issue occurs.
Here are the library versions used:
mlx-audio: 0.0.3 (the same issue occurs even when using the main branch via pip install git+https://github.com/Blaizzy/mlx-audio.git@main)
mlx-lm: 0.22.4
mlx: 0.24.2
Here is the error log:
% python -m mlx_audio.tts.generate --model mlx-community/3b-ko-ft-research_release-6bit --text "Hello, world"
Fetching 7 files: 100%|██████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 24818.37it/s]
ERROR:root:Model type llama not supported.
Error loading model: Model type llama not supported.
Traceback (most recent call last):
File "/Users/user/miniconda3/envs/llama4/lib/python3.10/site-packages/mlx_audio/tts/utils.py", line 30, in get_model_and_args
arch = importlib.import_module(f"mlx_audio.tts.models.{model_type}")
File "/Users/user/miniconda3/envs/llama4/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/Users/user/miniconda3/envs/llama4/lib/python3.10/site-packages/mlx_audio/tts/models/llama/__init__.py", line 1, in <module>
from .llama import Model, ModelConfig
File "/Users/user/miniconda3/envs/llama4/lib/python3.10/site-packages/mlx_audio/tts/models/llama/llama.py", line 13, in <module>
from mlx_lm.utils import stream_generate
ImportError: cannot import name 'stream_generate' from 'mlx_lm.utils' (/Users/user/miniconda3/envs/llama4/lib/python3.10/site-packages/mlx_lm/utils.py)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/user/miniconda3/envs/llama4/lib/python3.10/site-packages/mlx_audio/tts/generate.py", line 92, in generate_audio
model = load_model(model_path=model_path)
File "/Users/user/miniconda3/envs/llama4/lib/python3.10/site-packages/mlx_audio/tts/utils.py", line 141, in load_model
model_class, model_type = get_model_and_args(model_type=model_type)
File "/Users/user/miniconda3/envs/llama4/lib/python3.10/site-packages/mlx_audio/tts/utils.py", line 34, in get_model_and_args
raise ValueError(msg)
ValueError: Model type llama not supported.
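For what it's worth, the "Model type llama not supported" message hides the real failure. A minimal sketch (hypothetical, not the actual mlx-audio source) of the pattern the traceback implies: the loader swallows any ImportError raised while importing the model module and re-raises a generic ValueError, so the root cause only shows up via implicit exception chaining.

```python
import importlib


def get_model_and_args(model_type: str):
    """Hypothetical sketch of the loader behaviour implied by the traceback:
    an ImportError raised while importing the model module (in the real case,
    the missing `stream_generate` in mlx_lm.utils) is caught, and a generic
    "not supported" ValueError is raised in its place."""
    try:
        return importlib.import_module(f"mlx_audio.tts.models.{model_type}")
    except ImportError:
        # Raising inside the except block chains the original ImportError
        # implicitly -- that is the "During handling of the above exception,
        # another exception occurred" part of the traceback above.
        raise ValueError(f"Model type {model_type} not supported.")


try:
    get_model_and_args("llama")
except ValueError as err:
    error_message = str(err)
    # The original error survives as the chained context exception:
    root_cause = type(err.__context__).__name__

print(error_message)  # Model type llama not supported.
print(root_cause)     # the underlying ImportError/ModuleNotFoundError
```

So the actionable part of the log is the ImportError about stream_generate, which suggests a version mismatch between the installed mlx-audio and mlx-lm rather than a genuinely unsupported model type.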
Using the latest main, Orpheus works with this command (only on Python 3.11; I tried 3.12 and got dependency issues):
After pulling: pip install -r requirements.txt
Then:
python -m mlx_audio.tts.generate --model mlx-community/orpheus-3b-0.1-pretrained-bf16 --text "The quick brown fox jumps over the lazy dog." --play
But this doesn't really help with adding new languages 😄
@swlee60 your command also works using the latest main but gives really bad results (definitely not a "Hello world") haha. Maybe it doesn't work for you because you are using Python 3.10 instead of 3.11.
@johann-taberlet It works using main; I just generated a "Hello world" with the exact same command and it worked.
Thanks @m1m1s1ku!
Indeed, Python 3.11 is recommended.
@m1m1s1ku @Blaizzy You're right. When running it on Python 3.11, the WAV file is generated correctly. Thank you!