
Mac (Metal) support?

Open itsPreto opened this issue 1 year ago • 33 comments

Any chance of running this model on the unified memory of Apple silicon Macs? 16GB shared GPU/CPU

itsPreto avatar Nov 30 '23 00:11 itsPreto

Same issue. It doesn't work for now. I tried running the Hugging Face Space locally and got:

MPS would be available but cannot be used rn
RuntimeError: espeak not installed on your system

yukiarimo avatar Dec 05 '23 05:12 yukiarimo

@yukiarimo If you just want to do inference, you can install espeak via homebrew: https://formulae.brew.sh/formula/espeak

yl4579 avatar Dec 05 '23 08:12 yl4579

@yl4579 I have tried and got this error:

(ai) yuki@yuki styletts2 % python app.py
NLTK
[nltk_data] Downloading package punkt to /Users/yuki/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
SCIPY
TORCH STUFF
START
177
MPS would be available but cannot be used rn
/Users/yuki/anaconda3/envs/ai/lib/python3.10/site-packages/torch/nn/modules/rnn.py:71: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
  warnings.warn("dropout option adds dropout after all but last "
bert loaded
bert_encoder loaded
predictor loaded
decoder loaded
text_encoder loaded
predictor_encoder loaded
style_encoder loaded
diffusion loaded
text_aligner loaded
pitch_extractor loaded
mpd loaded
msd loaded
wd loaded
[nltk_data] Downloading package punkt to /Users/yuki/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
177
bert loaded
bert_encoder loaded
predictor loaded
decoder loaded
text_encoder loaded
predictor_encoder loaded
style_encoder loaded
diffusion loaded
text_aligner loaded
pitch_extractor loaded
mpd loaded
msd loaded
wd loaded
Traceback (most recent call last):
  File "/Users/yuki/Downloads/styletts2/app.py", line 105, in <module>
    btn.click(synthesize, inputs=[inp, voice, multispeakersteps], outputs=[audio], concurrency_limit=4)
TypeError: EventListenerMethod.__call__() got an unexpected keyword argument 'concurrency_limit'

yukiarimo avatar Dec 07 '23 06:12 yukiarimo

I'm not familiar with this app.py. Maybe you can ask in the repo that made it?

yl4579 avatar Dec 07 '23 06:12 yl4579

https://huggingface.co/spaces/styletts2/styletts2/tree/main

yukiarimo avatar Dec 07 '23 07:12 yukiarimo

@yukiarimo Please ask @fakerybakery

yl4579 avatar Dec 07 '23 08:12 yl4579

@fakerybakery have you looked into mlx? It's a new framework from Apple. They have a separate repo for examples.

itsPreto avatar Dec 07 '23 19:12 itsPreto

They've designed it to closely follow PyTorch's implementation, though I'm not sure exactly what this means in terms of interop. Still worth some attention!

itsPreto avatar Dec 07 '23 19:12 itsPreto

@fakerybakery I've tried your tutorial, but I am getting the error: "espeak is not found on your system". Any ideas?

yukiarimo avatar Dec 09 '23 04:12 yukiarimo

Hi, I just remembered that I had a similar issue. I’ll check my env to see what I did to fix it and get back to you. Sorry about the delay!

fakerybakery avatar Dec 10 '23 00:12 fakerybakery

@yukiarimo Did you successfully install espeak-ng with MacPorts? Can you try running:

echo 'this is a test' | espeak-ng -x -q --ipa -v en-us

fakerybakery avatar Dec 10 '23 00:12 fakerybakery

@fakerybakery Yes, it's working

Output: ðɪs ɪz ɐ tˈɛst

yukiarimo avatar Dec 10 '23 03:12 yukiarimo

Can you try running brew install espeak?

fakerybakery avatar Dec 10 '23 20:12 fakerybakery

Also, try setting the PHONEMIZER_ESPEAK_PATH env variable to the path of your espeak-ng installation (not the binary, the installation) and PHONEMIZER_ESPEAK_LIBRARY to the binary.

If you don't know the installation path, try setting PHONEMIZER_ESPEAK_LIBRARY=/opt/local/bin/espeak-ng
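For reference, here's a minimal sketch of setting those variables from Python instead of the shell. The MacPorts paths are assumptions (this is the combination reported to work later in this thread); adjust them to wherever `port` installed espeak-ng on your machine, and note that both must be set before phonemizer is imported:

```python
import os

# Assumed MacPorts locations; verify yours with `port contents espeak-ng`.
# These must be set before `phonemizer` is imported.
os.environ["PHONEMIZER_ESPEAK_PATH"] = "/opt/local/bin/espeak-ng"
os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = "/opt/local/lib/libespeak-ng.dylib"
```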

fakerybakery avatar Dec 10 '23 20:12 fakerybakery

Hi @yukiarimo!

Part 1: Installing espeak-ng

First, espeak on mac is a bit tricky to install and get working with phonemizer. Here's how I got it working:

  1. Install MacPorts (Brew is better, but doesn't work w/ espeak-ng)
  2. Install espeak-ng through MacPorts: sudo port install espeak-ng
  3. Phonemizer will give an error about phontab or missing data. You can resolve this by:
     1. Opening the ~/.zshrc file
     2. Adding export ESPEAK_DATA_PATH="/opt/local/share/espeak-ng-data" on a separate line at the end of the file
  4. Now Phonemizer should work on Mac.

Part 2: Resolving concurrency_limit

The issue with concurrency_limit is actually not an issue with MPS/Metal. It's an issue with the web UI framework used for this demo, Gradio. Try running pip install -U gradio.
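Upgrading Gradio is the real fix. If you need the same script to run across Gradio versions that may or may not accept the keyword, one stdlib-only workaround is to pass it only when the installed event listener actually accepts it. This is a hedged sketch; `supports_kwarg` is a name I made up, not a Gradio API:

```python
import inspect

def supports_kwarg(fn, name):
    """Return True if callable `fn` accepts keyword argument `name`,
    either explicitly or via **kwargs."""
    try:
        params = inspect.signature(fn).parameters.values()
    except (TypeError, ValueError):
        # Some callables (e.g. certain builtins) expose no signature.
        return False
    return any(
        p.name == name or p.kind is inspect.Parameter.VAR_KEYWORD
        for p in params
    )
```

Usage would look like `extra = {"concurrency_limit": 4} if supports_kwarg(btn.click, "concurrency_limit") else {}` followed by `btn.click(synthesize, inputs=[inp, voice, multispeakersteps], outputs=[audio], **extra)`.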

Part 3: About Metal/MPS

I tried modifying StyleTTS 2 to work with MPS a couple of weeks ago. It didn't work. PyTorch does not yet have full MPS support, so some features StyleTTS 2 requires are still unavailable on MPS.

However, StyleTTS 2 is so fast that you don't really need MPS support. Even on CPU, it only takes a few seconds to generate relatively long text.

I hope PyTorch adds these features soon; however, it currently looks like it will be a while before they're available.
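Given the current state of MPS support, the safe pattern is to treat CPU as the default and opt into MPS explicitly. A hedged sketch (the function name is mine; in a real script the flags would come from torch.cuda.is_available() and torch.backends.mps.is_available()):

```python
def pick_device(cuda_available: bool, mps_available: bool,
                allow_mps: bool = False) -> str:
    """Prefer CUDA, then (optionally) MPS, else CPU.

    allow_mps defaults to False because some ops StyleTTS 2 needs are not
    yet implemented for MPS in PyTorch; even when enabled, you may need
    PYTORCH_ENABLE_MPS_FALLBACK=1 so unsupported ops fall back to CPU.
    """
    if cuda_available:
        return "cuda"
    if mps_available and allow_mps:
        return "mps"
    return "cpu"
```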

Also, sorry about the weird messages ("MPS would be available but cannot be used rn", "torch stuff", etc.) - I was testing the code and forgot to remove the notes that probably make everything confusing.

Great rundown of the MPS issue and the related Gradio web UI issue, @fakerybakery! 🙏🏽 🙌🏽

It sounds like CPU inference is sufficient for the time being, then. @itsPreto makes a great suggestion with mlx. However, it is a fair amount of work to port over to it! 🤔

ve-varun-sharma avatar Dec 17 '23 19:12 ve-varun-sharma

I do have the same problem here. I installed espeak-ng through MacPorts and set the two environment variables like you said: PHONEMIZER_ESPEAK_LIBRARY=/opt/local/bin/espeak-ng and PHONEMIZER_ESPEAK_PATH=/opt/local/share/espeak-ng-data, and still app.py errors out with "RuntimeError: espeak not installed on your system". Any other suggestions?

mchack23 avatar Dec 18 '23 20:12 mchack23

Hi, sorry, I’m out of ideas on this issue :) Could you try opening an issue on the Phonemizer library?

fakerybakery avatar Dec 18 '23 20:12 fakerybakery

Hi there. I was playing with this a bit too and had similar issues. This allowed me to work around the espeak issue and run on mps/m1:

PHONEMIZER_ESPEAK_LIBRARY=/opt/homebrew/Cellar/espeak/1.48.04_1/lib/libespeak.dylib python3 styletts2_demo_libritts.py

I also used this command to locate my espeak installation: otool -L $(which espeak) | grep espeak.

These insights came from reading the issue here.

By the way, I also had to change some lines like ref_tokens = torch.LongTensor(ref_tokens).to(device).unsqueeze(0) to ref_tokens = torch.LongTensor(ref_tokens).unsqueeze(0). These issues are more straightforward than the espeak one.

mparrett avatar Dec 19 '23 00:12 mparrett

Hi @mparrett, what issues did you run into that required you to change this? Just curious, since it seemed to work when I ran it on an M1 Mac.

By the way, I also had to change some lines like ref_tokens = torch.LongTensor(ref_tokens).to(device).unsqueeze(0) to ref_tokens = torch.LongTensor(ref_tokens).unsqueeze(0). These issues are more straightforward than the espeak one.

fakerybakery avatar Dec 19 '23 01:12 fakerybakery

@fakerybakery Did you set device = 'mps' with no errors? Otherwise, the example notebook will probably run as-is because it selects cpu if cuda is not available. For me, after setting the device to mps I first ran into some unsupported operation and had to set PYTORCH_ENABLE_MPS_FALLBACK=1. Then there was another problem with the text encoder model. I decided to exclude it from using the mps device by modifying this line:

_ = [model[key].to(device) for key in model]

After that I had to make some changes, being careful that input tensors were going on the correct device depending on the model(s) being used, which is the reason for the change I mentioned before.

I noticed a significant speedup (~3s MPS vs ~5s CPU inference), but this was a quick hack and could probably be optimized much more for the Mac hardware.
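For anyone attempting the same hack, the per-module placement can be sketched like this. It assumes `model` is the dict of sub-modules the inference script builds (keys like "text_encoder", "bert", ...); the CPU_ONLY set and the helper name are my own, and which modules must stay on CPU may differ on your setup:

```python
# Modules assumed to hit unsupported MPS ops, so kept on CPU; this set is
# a guess based on the text-encoder problem described above.
CPU_ONLY = {"text_encoder"}

def place_modules(model, device="mps"):
    """Move each sub-module to `device`, except those in CPU_ONLY.

    Returns a {name: device} map so callers can route input tensors to
    the same device as the module that consumes them."""
    placement = {}
    for key in model:
        target = "cpu" if key in CPU_ONLY else device
        model[key] = model[key].to(target)
        placement[key] = target
    return placement
```

Input tensors then need to be sent to `placement[name]` before each sub-model call, which is why the `.to(device)` calls in the demo script had to change.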

mparrett avatar Dec 19 '23 01:12 mparrett

Oh, yes, MPS isn't supported yet. I ran it on CPU. Thanks for the tips!

fakerybakery avatar Dec 19 '23 01:12 fakerybakery

I do have the same problem here. I installed espeak-ng through MacPorts and set the two environment variables like you said: PHONEMIZER_ESPEAK_LIBRARY=/opt/local/bin/espeak-ng and PHONEMIZER_ESPEAK_PATH=/opt/local/share/espeak-ng-data, and still app.py errors out with "RuntimeError: espeak not installed on your system". Any other suggestions?

I am having the same issue.

SkyViz avatar Dec 19 '23 08:12 SkyViz

I do have the same problem here. I installed espeak-ng through MacPorts and set the two environment variables like you said: PHONEMIZER_ESPEAK_LIBRARY=/opt/local/bin/espeak-ng and PHONEMIZER_ESPEAK_PATH=/opt/local/share/espeak-ng-data, and still app.py errors out with "RuntimeError: espeak not installed on your system". Any other suggestions?

So I got it running in the end. The problem seems to have been declaring the environment variables without using export (they showed up through echo, but presumably weren't visible to Python).

So installing espeak-ng through macports and setting these environment variables made it work:

export PHONEMIZER_ESPEAK_LIBRARY=/opt/local/lib/libespeak-ng.dylib
export PHONEMIZER_ESPEAK_PATH=/opt/local/bin/espeak-ng

Sorry for the confusion, my mistake.

mchack23 avatar Dec 29 '23 19:12 mchack23

@mparrett Could you clarify the changes you made, or share your code regarding the mps device. (If I'm understanding you, you've successfully run using mps. Correct?)

Cheers.

changeling avatar Jul 27 '24 15:07 changeling

@mparrett Could you clarify the changes you made, or share your code regarding the mps device. (If I'm understanding you, you've successfully run using mps. Correct?)

Cheers.

That's right, I got this running with device = 'mps'. Happy to share my branch when I get a moment this weekend. Cheers.

mparrett avatar Jul 27 '24 16:07 mparrett

@mparrett Could you clarify the changes you made, or share your code regarding the mps device. (If I'm understanding you, you've successfully run using mps. Correct?) Cheers.

That's right, I got this running with device = 'mps'. Happy to share my branch when I get a moment this weekend. Cheers.

Look forward to that! Thank you!

changeling avatar Jul 27 '24 17:07 changeling

@mparrett Just checking in re: your mps implementation. Would it be easier just to enumerate the changes here in a comment, or just share a diff?

changeling avatar Aug 04 '24 14:08 changeling

Apologies for the delay, have been out of town. Getting back tomorrow and will share my branch. Thanks!

mparrett avatar Aug 05 '24 05:08 mparrett

No worries! Just looking forward to seeing your changes. Take your time. 😀

changeling avatar Aug 06 '24 13:08 changeling

Hi there @changeling ,

Finally got around to pushing these changes. My local repo is a bit of a mess and this wasn't intended for wider consumption. I prepared 3 branches for you to take a look at, with varying levels of granularity and noise :-). I left these un-rebased in case that matters to the final outcome, but they can be rebased without conflicts. Let me know if you have any questions!

https://github.com/mparrett/StyleTTS2/tree/matt-mps-squash
https://github.com/mparrett/StyleTTS2/tree/matt-mps-squash-partial
https://github.com/mparrett/StyleTTS2/tree/matt-mps

mparrett avatar Aug 07 '24 05:08 mparrett