torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGKILL during textless speech-to-speech translation
🐛 Bug
Running model inference with the command below:
```shell
python examples/speech_recognition/new/infer.py \
    --config-dir examples/hubert/config/decode/ \
    --config-name infer_viterbi \
    task.data=DATA_DIR \
    task.normalize=false \
    common_eval.results_path=RESULTS_PATH/log \
    common_eval.path=DATA_DIR/en_10min/checkpoint_best.pt \
    dataset.gen_subset=PD \
    '+task.labels=["unit"]' \
    +decoding.results_path=RESULTS_PATH \
    common_eval.post_process=none \
    +dataset.batch_size=1 \
    common_eval.quiet=True
```
fails with the following error:
```
Traceback (most recent call last):
  File "examples/speech_recognition/new/infer.py", line 438, in hydra_main
    distributed_utils.call_main(cfg, main)
  File "/opt/conda/lib/python3.7/site-packages/fairseq/distributed/utils.py", line 351, in call_main
    join=True,
  File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 146, in join
    signal_name=name
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with signal SIGKILL

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```
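As the Hydra message suggests, exporting `HYDRA_FULL_ERROR` before rerunning prints the complete stack trace, which may reveal where the worker dies. A minimal sketch:

```shell
# Make Hydra print the full stack trace on the next run.
export HYDRA_FULL_ERROR=1
# ...then rerun the infer.py command above unchanged.
```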
Before the failure, the following message is printed twice:
```
INFO:fairseq.tasks.hubert_pretraining:HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': '/large_experiments/ust/annl/data/accent/voxpopuli_en/hubert/it3_layer11_km1000', 'fine_tuning': True, 'labels': ['unit'], 'label_dir': '/large_experiments/ust/annl/data/accent/voxpopuli_en/hubert/it3_layer11_km1000', 'label_rate': -1.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': None, 'min_sample_size': None, 'single_target': True, 'random_crop': True, 'pad_audio': False}
INFO:fairseq.tasks.hubert_pretraining:current directory is /home/jovyan/fairseq/RESULTS_PATH/log/viterbi
INFO:fairseq.tasks.hubert_pretraining:HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': '/large_experiments/ust/annl/data/accent/voxpopuli_en/hubert/it3_layer11_km1000', 'fine_tuning': True, 'labels': ['unit'], 'label_dir': '/large_experiments/ust/annl/data/accent/voxpopuli_en/hubert/it3_layer11_km1000', 'label_rate': -1.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': None, 'min_sample_size': None, 'single_target': True, 'random_crop': True, 'pad_audio': False}
```
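A SIGKILL that arrives with no Python traceback from the worker itself usually comes from outside Python, most often the kernel OOM killer. A minimal check on a Linux host (assumes kernel logs are readable; `dmesg` may require root):

```shell
# Look for OOM-killer activity around the time of the crash.
# grep exits non-zero when nothing matches, so tolerate that with `|| true`.
dmesg -T 2>/dev/null | grep -iE 'killed process|out of memory' | tail -n 5 || true
# Show available RAM before retrying.
free -h
```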
Environment
- fairseq Version (e.g., 1.0 or main): 0.11.1 / 0.12.2
- PyTorch Version (e.g., 1.0): 1.12.0 / 1.13.1
- OS (e.g., Linux): Linux
- Python version: 3.7 / 3.8
- CUDA/cuDNN version: 11.4
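If the logs do point at memory exhaustion, forcing single-process inference may be worth trying. This is a sketch of a Hydra override to append to the command above; `distributed_training.distributed_world_size` is a standard fairseq option, but whether it avoids this particular SIGKILL is an untested assumption:

```
distributed_training.distributed_world_size=1   # run in a single process instead of spawning workers
```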
Any progress on this? I am hitting the same error. @AI4VTV