chatterbox
Output channels > 65536 not supported at the MPS device (M1/M2/M3/M4 chip)
Hello, fellow neural hackers!
I'm amazed by chatterbox's capabilities and would like to run it on my M1 8 GB MacBook, but I keep getting a `NotImplementedError: Output channels > 65536 not supported at the MPS device` error, and the only way to generate is to fall back to 'cpu'.
I can't say 'cpu' is slow, but if generation can be made even 1.5x faster, I'd really like that.
I'm on torch 2.7.1, torchaudio 2.7.1, and macOS 15.0.1.
Please help!
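One general workaround worth trying (an assumption on my part, not something verified on this exact setup): PyTorch has an environment flag that makes ops unsupported on MPS fall back to CPU instead of raising `NotImplementedError`. It has to be set before `torch` is imported, and per-op fallback can be slow:

```python
import os

# PYTORCH_ENABLE_MPS_FALLBACK must be set BEFORE `import torch`;
# with it set, ops that MPS does not implement run on CPU instead
# of raising NotImplementedError. Per-op fallback adds overhead.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # import only after the flag is set
```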
I had the same issue. I cloned the chatterbox repo and used the same Python version the repo specifies to create the venv for the cloned project, and it worked for me.
So I installed Python 3.11.0 with pyenv and updated macOS to 15.5, and now the "mps" device works.
I run it like this:
/Users/m1/.pyenv/versions/3.11.0/bin/python /Users/m1/chatterbox/example_for_mac.py
P.S. At first it seemed 1.5-2x slower than with "cpu", but that was because my MacBook Air was overheated; after letting it cool down, it now takes 1 minute (3x faster) to generate a two-sentence voice line. I really, really dread OS updates, but it was worth the risk.
Edit: no it wasn't! The 15.5 update messed up my Parallels Desktop setup for playing Turtle WoW ;(
I dunno, guys. After a day of testing I find it's better to stick to 'cpu' on fanless MacBooks. When a MacBook Air overheats, the 'mps' option can take 10 minutes for a short answer, while 'cpu' can run all night without slowing down. Overall, 'cpu' is faster if you are doing multiple generations in a row.
If you are doing some kind of audiobook or podcast, 'mps' will slow you down.
The issue is solved; I'm just sharing my Air M1 8 GB experience :)
Also seeing this on an Apple M2 Max, 64 GB RAM. The exact steps I took:
# My QuickStart Instructions:
# ## WITH PYTHON 3.11 and VENV (NOT BINARY FILE):
# -----------------
# git clone https://github.com/resemble-ai/chatterbox.git chatterbox_tts_resembleai
# cd chatterbox_tts_resembleai/
#
# # Create a virtual environment
# python3.11 -m venv venv_py311
#
# # Activate virtual environment
# # On Windows:
# venv_py311\Scripts\activate
# # On macOS/Linux:
# source venv_py311/bin/activate
#
# # Install the package in development mode
# pip install --upgrade pip
# pip install -e .
# pip install torch torchaudio
#
# # Run the example (SUPER SLOW... this downloads > 3 GB of safetensors files from Hugging Face):
# python example_tts.py
#
import torchaudio as ta
import torch
from chatterbox.tts import ChatterboxTTS
# Automatically detect the best available device
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"
print(f"Using device: {device}")
model = ChatterboxTTS.from_pretrained(device=device)
text = "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."
# Use the default voice that comes with Resemble.ai
wav = model.generate(text)
ta.save("test-1.wav", wav, model.sr)
# Clone a voice / synthesize with a different voice
# If you want to synthesize with a different voice, specify the audio prompt
# AUDIO_PROMPT_PATH = "test_audio_files/test_audio_harvard.wav"
# AUDIO_PROMPT_PATH = "test_audio_files/numbers_random_woman.wav"
AUDIO_PROMPT_PATH = "test_audio_files/peter_griffin_exag_1.0.wav"
wav = model.generate(text, audio_prompt_path=AUDIO_PROMPT_PATH)
ta.save("test-2.wav", wav, model.sr)
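For machines where "mps" still raises the error, a defensive wrapper (a sketch of my own, not part of the chatterbox API) can retry on CPU automatically; `make_model` is any callable that returns a loaded model for a given device string:

```python
def generate_with_fallback(make_model, text, preferred="mps"):
    """Try the preferred device first; if an op is unsupported there
    (NotImplementedError), rebuild the model on CPU and retry."""
    try:
        model = make_model(preferred)
        return model.generate(text), preferred
    except NotImplementedError:
        model = make_model("cpu")
        return model.generate(text), "cpu"

# Usage (hypothetical):
# wav, used_device = generate_with_fallback(
#     lambda d: ChatterboxTTS.from_pretrained(device=d), text)
```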