Retrieval-based-Voice-Conversion-WebUI

RuntimeError: makeDeviceForHostname(): unsupported gloo device

Open LacyCat opened this issue 3 months ago • 5 comments

Bug Description

Training fails on single-GPU systems with the following error:

RuntimeError: makeDeviceForHostname(): unsupported gloo device

Environment

GPU: NVIDIA GTX 1660 Ti (single GPU)
Python: 3.10
OS: Windows 11

FULL LOG

(rvc) G:\Retrieval-based-Voice-Conversion-WebUI>py infer-web.py
2025-08-31 16:44:09 | INFO | configs.config | Found GPU NVIDIA GeForce GTX 1660 Ti, force to fp32
2025-08-31 16:44:09 | INFO | configs.config | overwrite v1/32k.json
2025-08-31 16:44:09 | INFO | configs.config | overwrite v1/40k.json
2025-08-31 16:44:09 | INFO | configs.config | overwrite v1/48k.json
2025-08-31 16:44:09 | INFO | configs.config | overwrite v2/48k.json
2025-08-31 16:44:09 | INFO | configs.config | overwrite v2/32k.json
2025-08-31 16:44:09 | INFO | configs.config | overwrite preprocess_per to 3
2025-08-31 16:44:09 | INFO | configs.config | Half-precision floating-point: False, device: cuda:0
G:\Retrieval-based-Voice-Conversion-WebUI\rvc\lib\site-packages\gradio_client\documentation.py:106: UserWarning: Could not get documentation group for <class 'gradio.mix.Parallel'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
G:\Retrieval-based-Voice-Conversion-WebUI\rvc\lib\site-packages\gradio_client\documentation.py:106: UserWarning: Could not get documentation group for <class 'gradio.mix.Series'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
2025-08-31 16:44:10 | INFO | __main__ | Use Language: ko_KR
Running on local URL:  http://0.0.0.0:7865
2025-08-31 16:44:20 | INFO | __main__ | Execute: "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\Scripts\python.exe" infer/modules/train/preprocess.py "G:/Clip/LacyCat/raw/" 40000 4 "G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat" False 3.0
G:/Clip/LacyCat/raw/ 40000 4 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False 3.0
start preprocess
G:/Clip/LacyCat/raw/ 40000 4 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False 3.0
G:/Clip/LacyCat/raw/ 40000 4 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False 3.0
G:/Clip/LacyCat/raw/ 40000 4 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False 3.0
G:/Clip/LacyCat/raw/ 40000 4 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False 3.0
G:/Clip/LacyCat/raw//output.wav -> Success
end preprocess
2025-08-31 16:44:30 | INFO | __main__ | start preprocess
G:/Clip/LacyCat/raw//output.wav -> Success
end preprocess

2025-08-31 16:44:34 | INFO | __main__ | Execute: "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\Scripts\python.exe" infer/modules/train/extract/extract_f0_rmvpe.py 2 0 0 "G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat" False
2025-08-31 16:44:34 | INFO | __main__ | Execute: "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\Scripts\python.exe" infer/modules/train/extract/extract_f0_rmvpe.py 2 1 0 "G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat" False
infer/modules/train/extract/extract_f0_rmvpe.py 2 0 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False
todo-f0-11
f0ing,now-0,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_1.wav
f0ing,now-2,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_14.wav
f0ing,now-4,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_2.wav
f0ing,now-6,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_23.wav
f0ing,now-8,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_27.wav
f0ing,now-10,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_9.wav
infer/modules/train/extract/extract_f0_rmvpe.py 2 1 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False
todo-f0-10
f0ing,now-0,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_10.wav
f0ing,now-2,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_16.wav
f0ing,now-4,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_20.wav
f0ing,now-6,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_24.wav
f0ing,now-8,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_4.wav
2025-08-31 16:44:36 | INFO | __main__ | infer/modules/train/extract/extract_f0_rmvpe.py 2 0 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False
todo-f0-11
f0ing,now-0,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_1.wav
f0ing,now-2,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_14.wav
f0ing,now-4,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_2.wav
f0ing,now-6,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_23.wav
f0ing,now-8,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_27.wav
f0ing,now-10,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_9.wav
infer/modules/train/extract/extract_f0_rmvpe.py 2 1 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False
todo-f0-10
f0ing,now-0,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_10.wav
f0ing,now-2,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_16.wav
f0ing,now-4,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_20.wav
f0ing,now-6,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_24.wav
f0ing,now-8,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_4.wav

2025-08-31 16:44:36 | INFO | __main__ | Execute: "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\Scripts\python.exe" infer/modules/train/extract_feature_print.py cuda:0 1 0 0 "G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat" v2 False
infer/modules/train/extract_feature_print.py cuda:0 1 0 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat v2 False
exp_dir: G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat
load model(s) from assets/hubert/hubert_base.pt
2025-08-31 16:44:39 | INFO | fairseq.tasks.hubert_pretraining | current directory is G:\Retrieval-based-Voice-Conversion-WebUI
2025-08-31 16:44:39 | INFO | fairseq.tasks.hubert_pretraining | HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
2025-08-31 16:44:39 | INFO | fairseq.models.hubert.hubert | HubertModel Config: {'_name': 'hubert', 'label_rate': 50.0, 'extractor_mode': default, 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': gelu, 'layer_type': transformer, 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.05, 'dropout_input': 0.1, 'dropout_features': 0.1, 'final_dim': 256, 'untie_final_proj': True, 'layer_norm_first': False, 'conv_feature_layers': '[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', 'conv_bias': False, 'logit_temp': 0.1, 'target_glu': False, 'feature_grad_mult': 0.1, 'mask_length': 10, 'mask_prob': 0.8, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'conv_pos': 128, 'conv_pos_groups': 16, 'latent_temp': [2.0, 0.5, 0.999995], 'skip_masked': False, 'skip_nomask': False, 'checkpoint_activations': False, 'required_seq_len_multiple': 2, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}
G:\Retrieval-based-Voice-Conversion-WebUI\rvc\lib\site-packages\torch\nn\utils\weight_norm.py:144: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
  WeightNorm.apply(module, name, dim)
move model to cuda
all-feature-21
all-feature-done
2025-08-31 16:44:43 | INFO | __main__ | infer/modules/train/extract/extract_f0_rmvpe.py 2 0 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False
todo-f0-11
f0ing,now-0,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_1.wav
f0ing,now-2,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_14.wav
f0ing,now-4,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_2.wav
f0ing,now-6,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_23.wav
f0ing,now-8,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_27.wav
f0ing,now-10,all-11,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_9.wav
infer/modules/train/extract/extract_f0_rmvpe.py 2 1 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat False
todo-f0-10
f0ing,now-0,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_10.wav
f0ing,now-2,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_16.wav
f0ing,now-4,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_20.wav
f0ing,now-6,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_24.wav
f0ing,now-8,all-10,-G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat/1_16k_wavs/0_4.wav
infer/modules/train/extract_feature_print.py cuda:0 1 0 0 G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat v2 False
exp_dir: G:\Retrieval-based-Voice-Conversion-WebUI/logs/LacyCat
load model(s) from assets/hubert/hubert_base.pt
move model to cuda
all-feature-21
all-feature-done

2025-08-31 16:44:53 | INFO | __main__ | Use gpus: 0
2025-08-31 16:44:53 | INFO | __main__ | Execute: "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\Scripts\python.exe" infer/modules/train/train.py -e "LacyCat" -sr 40k -f0 1 -bs 3 -g 0 -te 20 -se 10 -pg assets/pretrained_v2/f0Ov2Super40kG.pth -pd assets/pretrained_v2/f0Ov2Super40kD.pth -l 0 -c 1 -sw 0 -v v2
INFO:LacyCat:{'data': {'filter_length': 2048, 'hop_length': 400, 'max_wav_value': 32768.0, 'mel_fmax': None, 'mel_fmin': 0.0, 'n_mel_channels': 125, 'sampling_rate': 40000, 'win_length': 2048, 'training_files': './logs\\LacyCat/filelist.txt'}, 'model': {'filter_channels': 768, 'gin_channels': 256, 'hidden_channels': 192, 'inter_channels': 192, 'kernel_size': 3, 'n_heads': 2, 'n_layers': 6, 'p_dropout': 0, 'resblock': '1', 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'resblock_kernel_sizes': [3, 7, 11], 'spk_embed_dim': 109, 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 4, 4], 'upsample_rates': [10, 10, 2, 2], 'use_spectral_norm': False}, 'train': {'batch_size': 3, 'betas': [0.8, 0.99], 'c_kl': 1.0, 'c_mel': 45, 'epochs': 20000, 'eps': 1e-09, 'fp16_run': False, 'init_lr_ratio': 1, 'learning_rate': 0.0001, 'log_interval': 200, 'lr_decay': 0.999875, 'seed': 1234, 'segment_size': 12800, 'warmup_epochs': 0}, 'model_dir': './logs\\LacyCat', 'experiment_dir': './logs\\LacyCat', 'save_every_epoch': 10, 'name': 'LacyCat', 'total_epoch': 20, 'pretrainG': 'assets/pretrained_v2/f0Ov2Super40kG.pth', 'pretrainD': 'assets/pretrained_v2/f0Ov2Super40kD.pth', 'version': 'v2', 'gpus': '0', 'sample_rate': '40k', 'if_f0': 1, 'if_latest': 0, 'save_every_weights': '0', 'if_cache_data_in_gpu': 1}
Process Process-1:
Traceback (most recent call last):
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "G:\Retrieval-based-Voice-Conversion-WebUI\infer\modules\train\train.py", line 129, in run
    dist.init_process_group(
  File "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\lib\site-packages\torch\distributed\c10d_logger.py", line 81, in wrapper
    return func(*args, **kwargs)
  File "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\lib\site-packages\torch\distributed\c10d_logger.py", line 95, in wrapper
    func_return = func(*args, **kwargs)
  File "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\lib\site-packages\torch\distributed\distributed_c10d.py", line 1764, in init_process_group
    default_pg, _ = _new_process_group_helper(
  File "G:\Retrieval-based-Voice-Conversion-WebUI\rvc\lib\site-packages\torch\distributed\distributed_c10d.py", line 1991, in _new_process_group_helper
    backend_class = ProcessGroupGloo(
RuntimeError: makeDeviceForHostname(): unsupported gloo device

LacyCat, Aug 31 '25 07:08

+1 having the same problem

marufmax, Sep 04 '25 14:09

And what would the solution be?

managerykuki-design, Sep 06 '25 18:09

https://github.com/pytorch/pytorch/issues/150381#issuecomment-3236080511 Try pytorch==2.7.1

zliu-aki, Sep 08 '25 13:09

I was having this issue as well. I'm not sure whether it is related to my using a Blackwell-architecture GPU, but here is the workaround I used:

In this script:

ultimate_rvc\rvc\train\train.py

I commented out this:

# Initialize distributed training environment for child node.
#dist.init_process_group(
#    backend="gloo" if sys.platform == "win32" or device.type != "cuda" else "nccl",
#    init_method="env://",
#    world_size=n_gpus if device.type == "cuda" else 1,
#    rank=rank if device.type == "cuda" else 0,
#)

And now it's working.
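A gentler variant of the same workaround is to skip the call only when a single device is in use, rather than commenting it out for everyone. The sketch below is an assumption built around the arguments shown in the commented-out call; `pick_backend` and `maybe_init_process_group` are hypothetical helper names, not functions from either repository.

```python
import sys


def pick_backend(platform: str, device_type: str) -> str:
    """Mirror the backend choice from the commented-out call above."""
    return "gloo" if platform == "win32" or device_type != "cuda" else "nccl"


def maybe_init_process_group(n_gpus: int, rank: int, device_type: str,
                             platform: str = sys.platform) -> bool:
    """Initialize torch.distributed only when more than one process takes part.

    Hypothetical helper: returns True if a process group was created,
    False if initialization was skipped.
    """
    # Same world_size/rank rules as the original call.
    world_size = n_gpus if device_type == "cuda" else 1
    effective_rank = rank if device_type == "cuda" else 0

    if world_size <= 1:
        # Single process: there are no peers, so gloo/nccl is never touched.
        return False

    # Imported lazily so the single-GPU path does not need torch.distributed.
    import torch.distributed as dist

    dist.init_process_group(
        backend=pick_backend(platform, device_type),
        init_method="env://",
        world_size=world_size,
        rank=effective_rank,
    )
    return True
```

Note that any later code that depends on an initialized process group, such as a DistributedDataParallel wrapper, would need the same guard; whether the training script does so depends on the version in use.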

Hoppsss, Nov 05 '25 01:11

Quoting zliu-aki: "pytorch/pytorch#150381 (comment) Try pytorch==2.7.1"

That seems to be a workaround. Have a look at:

https://github.com/pytorch/pytorch/issues/150381#issuecomment-2780178703

Try downgrading PyTorch.
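To see whether the environment already matches the version suggested earlier in this thread, something like the following could be run inside the `rvc` virtual environment. It is only a sketch: the helper names are made up for illustration, the pin `2.7.1` is simply the value quoted above, and the distribution name on pip is `torch` (not `pytorch`).

```python
from importlib import metadata


def base_version(version: str) -> str:
    """Strip a local build suffix such as '+cu118' from a version string."""
    return version.split("+", 1)[0]


def matches_pin(installed: str, pinned: str = "2.7.1") -> bool:
    """Return True if the installed version equals the pinned release."""
    return base_version(installed) == pinned


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


found = installed_torch_version()
if found is None:
    print("torch is not installed in this environment")
elif matches_pin(found):
    print(f"torch {found} matches the suggested pin")
else:
    print(f"torch {found} differs from the suggested pin 2.7.1")
```

If the versions differ, the downgrade itself would be a `pip install` of the pinned `torch` build that matches the local CUDA setup.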

julianertle, Nov 07 '25 15:11