seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
When I build unity with the `-DGGML_BUILD_EXAMPLES` option, the compiler issues an error.
```
/workspaces/seamless_communication/ggml/examples/common.h:44:58: error: 'unity_params' does not name a type; did you mean 'gpt_params'?
   44 | void unity_print_usage(int...
```
I assumed unity.cpp would be at feature parity with the original engine, but it looks like it only generates the translated text, not the audio. Is this something that will...
Apparently, the fairseq2n dependency is not supported on Windows, per https://github.com/facebookresearch/fairseq2/issues/170. Errors include: ERROR: Could not find a version that satisfies the requirement fairseq2n (from versions: none) ERROR:...
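For anyone hitting this, a minimal sketch of a pre-install check follows. It only inspects the operating system name and returns a hint. The helper `fairseq2n_install_hint` is hypothetical and is not part of seamless_communication or fairseq2. The Windows message assumes the situation described in the linked issue still holds.

```python
import platform
from typing import Optional


def fairseq2n_install_hint(system: Optional[str] = None) -> str:
    """Return a short hint about installing fairseq2n on the given OS.

    Hypothetical helper for illustration only; `system` defaults to
    the value reported by platform.system().
    """
    system = system or platform.system()
    if system == "Windows":
        # Assumption taken from the linked fairseq2 issue: pip finds no
        # fairseq2n build for native Windows, so WSL is suggested instead.
        return "fairseq2n has no native Windows build; try installing inside WSL."
    return "No Windows-specific limitation applies; install as documented."


if __name__ == "__main__":
    print(fairseq2n_install_hint())
```

Inside WSL, `platform.system()` reports `Linux`, so the check takes the second branch there.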
The speech output is completely off when running on CPU; when I run the same code on GPU, it works fine. Translations are not correct and the speech output is off.
./unity --model seamlessM4T_medium.ggml little-endian open_ggml_file: loading model from 'seamlessM4T_medium.ggml' load_model_weights: model size: 6542.93 MB, memory used: 6543.31 MB, memory reserved: 6543.31 MB Enter audio_path and tgt_lang, separated by space (or...
From the README: `expressivity_predict --tgt_lang --model_name seamless_expressivity --vocoder_name vocoder_pretssel --output--path`. The correct flag is `--output_path`, not `--output--path`. Running this also results in: (wslenv) martin@Desktop:/mnt/c/users/shkre/code/seamless/seamless_communication$ expressivity_predict 7.wav --tgt_lang spa --model_name seamless_expressivity...
Is it possible to modify the resulting translated text before it's transformed to audio? I want to do S2ST with expressions, but if the translation doesn't quite match the context or...
Updated the installation instructions to include WSL as an alternative for Windows users.
How can I get source language detection? This should be similar to the [detect_language function in whisper](https://github.com/openai/whisper/blob/main/README.md#:~:text=Below%20is%20an%20example%20usage%20of%20whisper.detect_language()%20and%20whisper.decode()%20which%20provide%20lower%2Dlevel%20access%20to%20the%20model.). Reason: When using a chatbot, I want to automatically detect the source language and provide...