Feature request: Support Apple Silicon M1 on macOS

Open Patronics opened this issue 3 years ago • 21 comments

Is your feature request related to a problem? Please describe.
It's been over a year since Apple released their M1 Macs, and since then they've released the M1 Pro and Max, which are very powerful for running TensorFlow, but Coqui still has no support for them.

Describe the solution you'd like
Port Coqui to support running on Apple's M1 series chips, and/or document the steps to build it manually.

Describe alternatives you've considered
Running DeepSpeech in x64 emulation mode with Rosetta doesn't seem to work, and Coqui hasn't added any commits to add support for Apple Silicon.

Additional context
There's some previous discussion about it here and a list of issues here that would seem to have mostly been resolved since, such as TensorFlow support.

Patronics avatar Nov 17 '21 07:11 Patronics

Does GitHub Actions support building Apple Silicon binaries? If so it should be relatively easy to get it up and running. No one in the team has the hardware so until that changes or someone donates a machine it'll be up to the community to provide not just answers but also pull requests.

reuben avatar Nov 17 '21 08:11 reuben

I have the following concerns about this.

  1. Why would someone train on a laptop?
  2. Isn't CUDA only available on Nvidia GPUs?
  3. Given the thermal related issues of those chips, is it really a good idea?

wasertech avatar Dec 08 '21 18:12 wasertech

This issue is about the STT inference package, or at least I've been operating under that assumption 😅. Training on M1 is out of scope for our project for the reasons you mentioned.

reuben avatar Dec 08 '21 18:12 reuben

I was under the wrong assumption then. Thanks Rüben.

wasertech avatar Dec 08 '21 22:12 wasertech

Looks like both libstt.so and the Python package build from source just fine, just need to figure out if we need any massaging of platform names to make things compatible, as the wheel gets created with the macosx_12_0_arm64 tag. I also notice that our Python 3.10 package got magically upgraded to a macosx_10_10_universal2 tag, but it doesn't work because the libstt.so embedded in it is not a universal binary:

$ pyenv install 3.10.2
python-build: use [email protected] from homebrew
python-build: use readline from homebrew
Downloading Python-3.10.2.tar.xz...
-> https://www.python.org/ftp/python/3.10.2/Python-3.10.2.tar.xz
Installing Python-3.10.2...
python-build: use readline from homebrew
python-build: use zlib from xcode sdk
Installed Python-3.10.2 to /Users/reuben/.pyenv/versions/3.10.2

$ pyenv virtualenv 3.10.2 test-stt
$ pyenv local test-stt
(test-stt)  $ pip install stt
Collecting stt
  Downloading stt-1.2.0-cp310-cp310-macosx_10_10_universal2.whl (3.1 MB)
     |████████████████████████████████| 3.1 MB 306 kB/s
Collecting numpy>=1.21.4
  Downloading numpy-1.22.2-cp310-cp310-macosx_11_0_arm64.whl (12.8 MB)
     |████████████████████████████████| 12.8 MB 142 kB/s
Installing collected packages: numpy, stt
Successfully installed numpy-1.22.2 stt-1.2.0
(test-stt)  $ stt --model ~/Downloads/coqui-stt-1.1.0-english.tflite --scorer ~/Downloads/huge-vocabulary.scorer --audio ~/Development/STT/data/smoke_test/LDC93S1_pcms16le_1_16000.wav
Traceback (most recent call last):
  File "/Users/reuben/.pyenv/versions/test-stt/bin/stt", line 5, in <module>
    from stt.client import main
  File "/Users/reuben/.pyenv/versions/3.10.2/envs/test-stt/lib/python3.10/site-packages/stt/__init__.py", line 23, in <module>
    from stt.impl import Version as version
  File "/Users/reuben/.pyenv/versions/3.10.2/envs/test-stt/lib/python3.10/site-packages/stt/impl.py", line 13, in <module>
    from . import _impl
ImportError: dlopen(/Users/reuben/.pyenv/versions/3.10.2/envs/test-stt/lib/python3.10/site-packages/stt/_impl.cpython-310-darwin.so, 0x0002): tried: '/Users/reuben/.pyenv/versions/3.10.2/envs/test-stt/lib/python3.10/site-packages/stt/_impl.cpython-310-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e')), '/usr/local/lib/_impl.cpython-310-darwin.so' (no such file), '/usr/lib/_impl.cpython-310-darwin.so' (no such file)

So I guess building a universal libstt.so would automatically give us Apple Silicon support on 3.10.

reuben avatar Feb 08 '22 17:02 reuben
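
To illustrate the tag mismatch reuben describes above, here is a minimal Python sketch that splits a wheel filename into its tags and checks whether the platform tag claims to cover a given architecture. The helper names `parse_wheel_filename` and `wheel_claims_arch` are hypothetical and not part of pip or STT; the sketch assumes the standard wheel filename layout and only looks at the tag, which is exactly why a wheel tagged universal2 can still contain an x86_64-only library.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its components (standard wheel naming)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def wheel_claims_arch(platform_tag, arch):
    """Return True if the macOS platform tag claims to support `arch`.

    This checks the tag only. It says nothing about the binaries that are
    actually inside the wheel.
    """
    for tag in platform_tag.split("."):
        # universal2 is the tag for fat x86_64 + arm64 builds
        if tag.endswith("_universal2") and arch in ("x86_64", "arm64"):
            return True
        if tag.endswith("_" + arch):
            return True
    return False


if __name__ == "__main__":
    info = parse_wheel_filename("stt-1.2.0-cp310-cp310-macosx_10_10_universal2.whl")
    print(info["platform"], wheel_claims_arch(info["platform"], "arm64"))
```

The tag alone makes pip select this wheel on an arm64 interpreter, so the embedded libstt.so has to be universal as well for the import to succeed.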

Cross building from an Intel Mac with bazel build --config=macos_arm64 also seems to work fine, so we can handle that in CI. Newer bazel versions seem to generate universal binaries automatically where supported. TensorFlow 2.8 is not using such a version yet, but maybe it works out of the box.

reuben avatar Feb 08 '22 17:02 reuben

I think I misunderstood the notes, it's only about toolchain binaries. Plus the build is broken on bazel 5.0.0. I guess the easiest way is to do two separate builds, one for x86_64 and one for arm64, and then manually collect the artifacts and call lipo to create the universal binary.

reuben avatar Feb 08 '22 17:02 reuben
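
The two-builds-plus-lipo approach mentioned above could be sketched roughly as follows. This is only an illustration: the function names and the input paths (`build-x86_64/libstt.so`, `build-arm64/libstt.so`) are hypothetical, and the sketch assumes both single-architecture libraries have already been built. `lipo -create ... -output ...` is the standard macOS invocation for merging thin binaries into a universal one.

```python
import shutil
import subprocess


def lipo_create_command(inputs, output):
    """Build the argv for merging single-architecture binaries with lipo."""
    if len(inputs) < 2:
        raise ValueError("need at least two single-architecture inputs")
    return ["lipo", "-create", *inputs, "-output", output]


def make_universal(inputs, output):
    """Run lipo to produce a universal binary (macOS only)."""
    if shutil.which("lipo") is None:
        raise RuntimeError("lipo not found; this step only works on macOS")
    subprocess.run(lipo_create_command(inputs, output), check=True)


if __name__ == "__main__":
    # Hypothetical artifact locations, one per architecture.
    cmd = lipo_create_command(
        ["build-x86_64/libstt.so", "build-arm64/libstt.so"],
        "universal/libstt.so",
    )
    print(" ".join(cmd))
```

Keeping the command construction separate from the call to `subprocess.run` makes it easy to log or dry-run the step in CI before touching any artifacts.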

So you have the machine now 😂👍🏻

wasertech avatar Feb 09 '22 00:02 wasertech

Looks like choosing macos-min-version is not enough to select the appropriate version of the SDK on the macos-10.15 workers, so I need to figure out how to force it. Once that's working, we need to install the arm64 versions of the dependencies we link against, maybe like this: https://github.com/Homebrew/discussions/discussions/2843#discussioncomment-2027526

reuben avatar Feb 09 '22 10:02 reuben

Aha, bazel ignores xcode-select, you have to pass e.g. --xcode_version 12.2 to specify which Xcode version you want to use when multiple are installed.

reuben avatar Feb 09 '22 14:02 reuben
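
Putting the two flags from this thread together, a CI step might assemble the bazel invocation along these lines. This is a sketch under assumptions: `bazel_build_command` is a hypothetical helper, the target label `//native_client:libstt.so` is assumed rather than confirmed in this thread, and a real build would likely need further options from the STT build docs. Only `--config=macos_arm64` and `--xcode_version 12.2` are taken from the comments above.

```python
def bazel_build_command(target, arm64=False, xcode_version=None):
    """Assemble a bazel build argv for a native or arm64 cross build."""
    cmd = ["bazel", "build"]
    if arm64:
        # Cross-build config mentioned earlier in the thread.
        cmd.append("--config=macos_arm64")
    if xcode_version is not None:
        # bazel ignores xcode-select, so the Xcode version is passed explicitly.
        cmd.extend(["--xcode_version", xcode_version])
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    # Assumed target label, for illustration only.
    target = "//native_client:libstt.so"
    for arm64 in (False, True):
        print(" ".join(bazel_build_command(target, arm64=arm64, xcode_version="12.2")))
```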

PR #2100 is building and passing tests, we now have universal builds for libstt.so, libkenlm.so and the stt CLI client. The bindings need a bit more work: the SWIG bindings also need to be made universal, and for Node/Electron node-pre-gyp needs to be updated and made to handle the new architecture.

reuben avatar Feb 21 '22 14:02 reuben

Also the stt binary for arm64 is built without SoX enabled, because we need to find a way to install the arm64 version of SoX and its dependencies on an x86_64 macOS worker in CI, to link against them. Some tips were shared here but I still haven't been able to try them out: https://github.com/Homebrew/discussions/discussions/2843#discussioncomment-2142210

reuben avatar Feb 24 '22 10:02 reuben

Any progress made with this? I really want to use Coqui STT for a project. I tried to follow the guide for building a binary, but it didn't work out on my M1 MacBook. If anyone could help me generate the binary on my M1 MacBook, that would be greatly appreciated, as the docs didn't have anything about compiling on an M1 machine. Thank you.

I tried emulating with x86_64 and either got a bad instruction set error or, if I run it with arm, it says a Mach-O file for arm64e is needed.

Vrajs16 avatar May 17 '22 16:05 Vrajs16

Reading https://twitter.com/coqui_ai/status/1495830121466327043 I was super excited 🤗 but I did NOT manage to get 🐸STT running directly on my M1 Mac.

Node.js user here: using the web_microphone_websocket example, running node server.js fails with this error:

Error: Cannot find module '/myprojectfolder/node_modules/stt/lib/binding/v1.3.0/darwin-arm64/node-v93/stt.node'

Weirdly enough (and hopefully that helps here 🤞), I did manage to run that Node.js server on my M1, but through a multipass Ubuntu virtual machine!

  • multipass launch --name coquistt
  • multipass mount /wherever/your/web_microphone_websocket/project/is coquistt (mount that path in the VM to share it between your Mac and that VM)
  • then multipass shell coquistt and everything below is within that VM shell
  • sudo snap install node --classic
  • sudo apt-get update && sudo apt-get install build-essential (that's for make used during yarn install)
  • yarn install now works fine

The uname -a on that VM displays Linux coquistt 5.4.0-109-generic #123-Ubuntu SMP Fri Apr 8 09:12:14 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux so I'm guessing something is messed up (or not ready yet?) on the 🐸STT build/publish process, as the darwin-arm64 folder doesn't exist as a binding folder, where linux-arm64 exists and works fine in that VM of mine.

👉 @reuben maybe the relevant library exists but the Node.js bindings are not up to date just yet? Or I misinterpreted your tweet and those bindings are just not ready 🙂 I can help reproduce stuff on my M1 machine if that helps you 🤓
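
For anyone debugging the same "Cannot find module" error, the path in the message suggests a per-platform layout of the form lib/binding/v{version}/{platform}-{arch}/{node ABI}/stt.node. The sketch below rebuilds that path so you can check which platform folders exist in your install. The layout is inferred from the error message in this comment only, and `binding_path` is a hypothetical helper, not part of the stt package.

```python
from pathlib import PurePosixPath


def binding_path(package_root, version, platform, arch, node_abi):
    """Reconstruct the expected location of the native stt.node binding."""
    return str(
        PurePosixPath(package_root)
        / "lib"
        / "binding"
        / ("v" + version)
        / (platform + "-" + arch)
        / node_abi
        / "stt.node"
    )


if __name__ == "__main__":
    root = "/myprojectfolder/node_modules/stt"
    # The folder reported missing on the M1 Mac.
    print(binding_path(root, "1.3.0", "darwin", "arm64", "node-v93"))
    # The folder that exists and works inside the aarch64 Ubuntu VM.
    print(binding_path(root, "1.3.0", "linux", "arm64", "node-v93"))
```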


Tips & tricks for anyone ending up here and for whom the example still doesn't work:

  • start by downloading the model you are interested in there: https://coqui.ai/models
  • the server.js code expects to find 2 files for your model in web_microphone_websocket/coqui-stt-models.scorer and web_microphone_websocket/coqui-stt-models.tflite so move and rename the relevant downloaded model files
  • change localhost to 0.0.0.0 in the server.js file so that you can use it from your Mac while it runs in the VM
  • launch node server.js from the VM shell: from your Mac, it'll fail with above error, which was the whole point of the VM thing
  • change the io.connect uri in src/App.js to whatever IP multipass ls is giving you for your VM (in my case it was this.socket = io.connect('http://192.168.64.4:4000', {});)
  • launch yarn start from your Mac and NOT from the VM: navigator.getUserMedia only works through HTTPS (as in HTTP Secure), or localhost, and if you run it from the VM, you are now using 192.168.64.4 (see here)

clorichel avatar May 21 '22 16:05 clorichel
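
The last tip above comes down to whether the page is served from an origin the browser treats as secure. As a rough illustration, the following Python sketch approximates that rule for a URL: HTTPS, localhost names, and loopback addresses pass, while a plain-HTTP LAN address such as the VM's does not. `is_secure_origin` is a hypothetical helper, port 3000 is only an assumed dev-server port, and real browsers apply a more detailed check than this.

```python
import ipaddress
from urllib.parse import urlsplit


def is_secure_origin(url):
    """Approximate the browser rule that gates microphone access."""
    parts = urlsplit(url)
    if parts.scheme in ("https", "wss"):
        return True
    host = parts.hostname or ""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # Loopback addresses such as 127.0.0.1 or ::1 also count.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal and not a localhost name.
        return False


if __name__ == "__main__":
    for url in ("http://localhost:3000", "http://192.168.64.4:3000"):
        print(url, is_secure_origin(url))
```

This is why the front end is started on the Mac (served from localhost) while only the socket server runs inside the VM.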

@reuben any news here? Given that tweet was in February, one would hope there would be some officially documented steps (if different than the normal installation method) to install and run it on an M1 Mac, ideally without the complexity of virtual machines as @clorichel described above.

Patronics avatar Jun 20 '22 01:06 Patronics

I may have been able to partially fix the issues on the M1. I managed to build the bindings for darwin-arm64 using the modifications here: https://github.com/phobosdpl/STT/tree/m1_js_bindings and following the instructions here: https://stt.readthedocs.io/en/latest/BUILDING.html on my M1 MacBook Pro.

After manually copying those into place on an existing install (I needed to copy the stt/lib/binding/v1.4.0-alpha.1/darwin-arm64/node-v93 file, and all the files in tensorflow/bazel-out/darwin_arm64-opt/bin/native_client), I get a new error, seemingly from node-vad:

/path/to/app/node_modules/bindings/bindings.js:121
        throw e;
        ^

Error: dlopen(/path/to/app/node_modules/node-vad/build/Release/vad.node, 0x0001): tried: '/path/to/app/node_modules/node-vad/build/Release/vad.node' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e'))
    at Object.Module._extensions..node (node:internal/modules/cjs/loader:1189:18)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12)
    at Module.require (node:internal/modules/cjs/loader:1005:19)
    at require (node:internal/modules/cjs/helpers:102:18)
    at bindings (/path/to/app/node_modules/bindings/bindings.js:112:48)
    at Object.<anonymous> (/path/to/app/node_modules/node-vad/lib/vad.js:4:49)
    at Module._compile (node:internal/modules/cjs/loader:1105:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1159:10)
    at Module.load (node:internal/modules/cjs/loader:981:32) {
  code: 'ERR_DLOPEN_FAILED'
}

Patronics avatar Jun 21 '22 04:06 Patronics
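
The "incompatible architecture (have 'x86_64', need 'arm64e')" error above, like the earlier one for _impl.cpython-310-darwin.so, means the native module on disk was built for a different CPU. A minimal sketch for checking which architectures a Mach-O file contains is shown below; it reads only the header bytes, handles just the 64-bit thin and 32-bit fat header formats, and does not distinguish subtypes such as arm64e. `macho_archs` is a hypothetical helper; on macOS, `lipo -info` or `file` reports the same information.

```python
import struct
from typing import List

MH_MAGIC_64 = 0xFEEDFACF  # thin 64-bit Mach-O, little-endian on disk
FAT_MAGIC = 0xCAFEBABE    # universal (fat) binary, big-endian header
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_archs(data: bytes) -> List[str]:
    """Return the CPU architectures named in a Mach-O or fat header."""
    if len(data) < 8:
        raise ValueError("too short to be a Mach-O file")
    (be_magic,) = struct.unpack_from(">I", data, 0)
    if be_magic == FAT_MAGIC:
        (count,) = struct.unpack_from(">I", data, 4)
        archs = []
        for i in range(count):
            # Each fat_arch entry is five big-endian uint32 fields (20 bytes);
            # the first field is the CPU type.
            (cputype,) = struct.unpack_from(">I", data, 8 + 20 * i)
            archs.append(CPU_NAMES.get(cputype, hex(cputype)))
        return archs
    le_magic, cputype = struct.unpack_from("<II", data, 0)
    if le_magic == MH_MAGIC_64:
        return [CPU_NAMES.get(cputype, hex(cputype))]
    raise ValueError("not a recognised Mach-O header")


def file_archs(path: str) -> List[str]:
    """Read enough of the file to cover the header and inspect it."""
    with open(path, "rb") as handle:
        return macho_archs(handle.read(4096))
```

Running `file_archs` on vad.node (or on stt.node) would show whether the module needs to be rebuilt for arm64 or replaced with a universal build.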

I have no idea what, if anything, has changed but now I seem to be able to complete the process of producing the node bindings (with the branch described above), but attempting to run the app after running npm install on the stt-1.4.0-alpha.1.tgz file produced is giving the error

dyld[46873]: missing symbol called
zsh: abort      node scriptname.js

when hitting the line new stt.Model(modelPath);

Patronics avatar Aug 17 '22 06:08 Patronics

Looks like it's been another 7 months. Wondering if anyone has had any luck, or could at least make a list of problems so that some of us contributors could take a stab at fixing them?

mattkanwisher avatar Mar 29 '23 11:03 mattkanwisher

I was about to come here and post about my findings! I have successfully compiled and tested Coqui for M1 macOS. I was only able to do so with a self-hosted runner on my own machine though, but it should be possible to adapt from there (as there is no M1 server available for Coqui as far as I know). Will open a draft PR shortly, and we can find a solution that works for everyone from there :)

dsouza95 avatar Mar 29 '23 13:03 dsouza95

Just to keep you guys updated: I had some cleaning up to do to keep x64 support, and I have been running into several broken URLs in the automated builds lately. I fixed some on my own fork for now, but it should take me a couple more days to get all the small details sorted.

dsouza95 avatar Mar 30 '23 19:03 dsouza95

Opened a draft PR with the required changes to the build pipeline, but it requires a self-hosted ARM macOS runner. Not sure how this could be made viable; perhaps it would be easier to wait for the GitHub ARM runners to be available. Regardless of that, it is rather simple to host a runner on your own Mac in case anyone wants to give that a try. It should also be possible to build the libraries locally, though I have not tested it this way.

dsouza95 avatar Apr 02 '23 03:04 dsouza95