STT
Feature request: support for Linux/AArch64 + GPU
If you have a feature request, then please provide the following information:
Is your feature request related to a problem? Please describe. We have an AArch64 device (NVIDIA Jetson AGX) and would like to use its GPU cores: running the non-CUDA version of Coqui tends to be too slow for real-time inference on our platform, as well as too CPU intensive.
Describe the solution you'd like Add Linux/AArch64 + GPU to the supported platforms.
Describe alternatives you've considered We have tried TFLite models on these devices; they run, but tend to be of worse quality, and at least with DeepSpeech they ran slower than CUDA-enabled full-scale models.
With DeepSpeech, this project https://github.com/domcross/DeepSpeech-for-Jetson-Nano/releases managed to port 0.9.x to a Jetson Nano and AGX, showing that, at least with the DeepSpeech code base before your fork, this is possible. We have also built it ourselves for an older version of DeepSpeech, but the build process was tricky and likely beyond the scope of many teams.
Pull requests would be greatly appreciated :)
Will see what I can do. In the project I am working on, device time is quite limited at the moment, and if I remember correctly, previous attempts to cross compile DeepSpeech did not end well. That was for version 0.5 I think; maybe things will be better now with all your changes :)
Thank you! We have some cross compilation infra up and running for ARM already, maybe you can base it off of that. Some references:
https://github.com/coqui-ai/STT/blob/6f2c7a8a7baaa76f5f2c59fd956971deb4afed2c/.github/workflows/build-and-test.yml#L2183-L2253
https://github.com/coqui-ai/STT/blob/6f2c7a8a7baaa76f5f2c59fd956971deb4afed2c/ci_scripts/tf-vars.sh#L169-L170
Most of it is just flags to Bazel which the TF build system then handles on its own. But I'm sure CUDA complicates things. Let me know if there's any way I can help with getting the PR up!
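For anyone picking this up, a rough sketch of what the Bazel invocation might look like when combining the existing ARM64 config with CUDA. This is an assumption based on TensorFlow's standard `--config` names (`elinux_aarch64`, `cuda`), not a verified build recipe; the CUDA toolchain on Jetson (JetPack) may need extra setup beyond these flags.

```shell
# Hypothetical sketch: cross-compile the STT native client for aarch64 with CUDA.
# --config=elinux_aarch64 and --config=cuda are TensorFlow's standard Bazel configs;
# whether they compose cleanly for Jetson targets is untested here.
bazel build \
  --config=elinux_aarch64 \
  --config=cuda \
  --action_env=TF_CUDA_COMPUTE_CAPABILITIES="5.3,6.2,7.2" \
  //native_client:libstt.so
```

The compute capabilities listed cover the Jetson Nano (5.3), TX2 (6.2), and AGX Xavier (7.2); trimming the list to the single target device shortens the build.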
Thanks for the flags, much appreciated. I have a long list of tasks to do before I get to this, but I hope to give it a proper bash in the next 6 weeks or so.
Any updates on this? I was looking to use CUDA on a Jetson Nano, but it doesn't look like that's possible.