Should the Sampling Rate Be Set Automatically from Each Model's Default Sampling Rate?
Is there a reason why the sampling rate is settable by the user? Doesn't every model have a default sampling rate at which it operates? If so, I wonder whether the sampling rate should be returned by the model when it is loaded, instead of being set by the user.
For example, Spark TTS uses a 16 kHz sampling rate, and if you run mlx_audio.tts.generate with the --play flag, it plays the generated audio at the default 24 kHz, creating a chipmunk effect. If I change generate.py to delay initializing the player until the correct sampling rate is known, it works:
for i, result in enumerate(results):
    if play:
        if i == 0:
            # Load AudioPlayer
            player = AudioPlayer(sample_rate=result.sample_rate)
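The same idea can be sketched without depending on the loop index. This is only an illustration under assumptions: `FakeResult`, `FakePlayer`, its `queue_audio` method, and `play_results` are hypothetical stand-ins written for this example, not the actual classes or functions in generate.py.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in for one generated audio segment."""
    audio: List[float]
    sample_rate: int


class FakePlayer:
    """Hypothetical stand-in for AudioPlayer; it only records what it receives."""

    def __init__(self, sample_rate: int) -> None:
        self.sample_rate = sample_rate
        self.queued: List[List[float]] = []

    def queue_audio(self, audio: List[float]) -> None:
        self.queued.append(audio)


def play_results(results: Iterable[FakeResult], play: bool = True) -> Optional[FakePlayer]:
    """Create the player lazily, once the first result reveals the model's rate."""
    player: Optional[FakePlayer] = None
    for result in results:
        if not play:
            continue
        if player is None:
            # Deferred construction: use the rate reported by the result itself
            # rather than a hard-coded default.
            player = FakePlayer(sample_rate=result.sample_rate)
        player.queue_audio(result.audio)
    return player


player = play_results([FakeResult([0.0, 0.1], 16000), FakeResult([0.2], 16000)])
print(player.sample_rate)
```

Checking `player is None` instead of `i == 0` also covers the case where no results are produced, since no player is ever created.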
Thanks!
Yes, this would be nice! It's not always present in the model config, so we may need a manual mapping for some models. Do you want to send a PR? 🙏
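A rough sketch of what such a lookup could look like, assuming the config is a plain dict. The `sample_rate` config key, the `"spark"` model-type string, and the names `FALLBACK_SAMPLE_RATES` and `resolve_sample_rate` are all hypothetical; only the 16 kHz and 24 kHz values come from this issue.

```python
from typing import Any, Mapping, Optional

# Hypothetical manual mapping for models whose config lacks a sample rate.
# The 16000 value is the Spark TTS rate reported in this issue.
FALLBACK_SAMPLE_RATES = {"spark": 16000}

# The default playback rate mentioned in this issue.
DEFAULT_SAMPLE_RATE = 24000


def resolve_sample_rate(
    model_type: str,
    config: Mapping[str, Any],
    user_override: Optional[int] = None,
) -> int:
    """Pick a rate: explicit user value, then config, then manual mapping, then default."""
    if user_override is not None:
        return user_override
    rate = config.get("sample_rate")  # assumed key name, may differ per model
    if rate is not None:
        return int(rate)
    return FALLBACK_SAMPLE_RATES.get(model_type, DEFAULT_SAMPLE_RATE)


print(resolve_sample_rate("spark", {}))
```

Keeping an optional user override first in the chain would preserve the current behavior for anyone who sets the rate explicitly.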
Thanks for opening this issue!
Yes, I agree.
It's one of the things I kept in mind coming into v0.2.0 since we now have lots of models with varying sampling rates.
Unfortunately, it went to the backlog alongside a redesign of our Python APIs.