seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
After fine-tuning on a custom dataset that I generated myself, based on the words the model reads incorrectly, and after fine-tuning, loading the file...
why??!! %cd /content/my_seamless_communication/src !torchrun \ --rdzv-backend=c10d \ --rdzv-endpoint=localhost:0 \ --nnodes=1 \ --nproc-per-node=1 \ /content/my_seamless_communication/src/finetune.py \ --mode SPEECH_TO_SPEECH \ --train_dataset /content/my_m4t_dataset/train_manifest.json \ --eval_dataset /content/my_m4t_dataset/validation_manifest.json \ --learning_rate 1e-6 \ --warmup_steps 100 \...
Dear all, I can finetune with m4t_finetune in SPEECH_TO_TEXT mode successfully. However, when I finetune in --mode TEXT_TO_SPEECH and SPEECH_TO_SPEECH, the script throws the error "NotImplementedError: T2U finetuning implemented only for...
When I run finetune, it tells me: Traceback (most recent call last): File "/opt/conda/bin/m4t_finetune", line 8, in sys.exit(main()) File "/opt/conda/lib/python3.10/site-packages/seamless_communication/cli/m4t/finetune/finetune.py", line 148, in main text_tokenizer = load_unity_text_tokenizer(args.model_name) File "/opt/conda/lib/python3.10/site-packages/fairseq2/models/utils/generic_loaders.py", line 353,...
(base) ➜ seamless_communication git:(main) ✗ m4t_predict ./input/test.mp3 --task s2st --tgt_lang fra --output_path ./output Traceback (most recent call last): File "/opt/homebrew/bin/m4t_predict", line 5, in from seamless_communication.cli.m4t.predict.predict import main File "/opt/homebrew/lib/python3.9/site-packages/seamless_communication/__init__.py", line...
I generated a custom dataset based on the words that are read incorrectly and trained the v1 large model on text-to-speech, and I received...
I want to build an API, but the translation produced from the audio input is wrong. I think I may have made a mistake when processing the audio file, may...
I want to train the m4t v2 model, but I am facing various errors. Has anyone done this? Is this possible now? Can you guide me?
m4t_predict --task S2ST --tgt_lang cmn --src_lang eng --model_name seamlessM4T_v2_large --output_path ./ddd.mp3 ./test.mp3 Python 3.10, macOS 12.7.5, Intel x86_64 (.venv) ➜ seamless_communication git:(main) ✗ m4t_predict --task S2ST --tgt_lang cmn --src_lang...
run: m4t_predict --task S2ST --tgt_lang cmn --src_lang eng --model_name seamlessM4T_v2_large --output_path "/Users/liuhao/ddd.mp3" ./test.mp3 save file : RuntimeError: torchaudio_sox::save_audio_file() Expected a value of type 'str' for argument '_0' but instead found...