RuntimeError: unmatched '}' in format string

Open c469591 opened this issue 1 year ago • 3 comments

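Hello. After clicking 开启GPT训练 (Start GPT training) I get the error below. How can I fix it? Clicking 开启SoVITS训练 (Start SoVITS training) works normally. I am running the full package on Windows 10.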

"runtime\python" GPT_SoVITS/s1_train.py --config_file "TEMP/tmp_s1.yaml"
Seed set to 1234
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
<All keys matched successfully>
ckpt_path: None
[rank: 0] Seed set to 1234
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
Traceback (most recent call last):
  File "I:\GPT-SoVITS\GPT-SoVITS\GPT_SoVITS\s1_train.py", line 138, in <module>
    main(args)
  File "I:\GPT-SoVITS\GPT-SoVITS\GPT_SoVITS\s1_train.py", line 115, in main
    trainer.fit(model, data_module, ckpt_path=ckpt_path)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\call.py", line 43, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\strategies\launchers\subprocess_script.py", line 102, in launch
    return function(*args, **kwargs)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 947, in _run
    self.strategy.setup_environment()
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 148, in setup_environment
    self.setup_distributed()
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 199, in setup_distributed
    _init_dist_connection(self.cluster_environment, self._process_group_backend, timeout=self._timeout)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\lightning_fabric\utilities\distributed.py", line 290, in _init_dist_connection
    torch.distributed.init_process_group(torch_distributed_backend, rank=global_rank, world_size=world_size, **kwargs)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\torch\distributed\distributed_c10d.py", line 888, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 245, in _env_rendezvous_handler
    store = _create_c10d_store(master_addr, master_port, rank, world_size, timeout)
  File "I:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 176, in _create_c10d_store
    return TCPStore(
RuntimeError: unmatched '}' in format string

c469591 avatar Jan 17 '24 12:01 c469591

I found a temporary workaround. Open GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py, find the line start_daemon = rank == 0 (around line 175), and add the line hostname = "localhost" directly below it. That fixes it.

light1943 avatar Jan 30 '24 11:01 light1943

Thanks!

c469591 avatar Jan 31 '24 23:01 c469591

I found a temporary workaround. Open GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py, find the line start_daemon = rank == 0 (around line 175), and add the line hostname = "localhost" directly below it. That fixes it.

Is this how it should be changed? Mine still doesn't work:

    start_daemon = rank == 0
    hostname = "localhost"
    return TCPStore(
        hostname, port, world_size, start_daemon, timeout, multi_tenant=True
    )

fgod999 avatar Feb 02 '24 19:02 fgod999

https://github.com/RVC-Boss/GPT-SoVITS/commit/59f35adad85815df27e9c6b33d420f5ebfd8376b In theory, this commit fixes the original poster's problem. If it still doesn't work, try the method from the comment above: adding hostname = "localhost".

RVC-Boss avatar Feb 08 '24 13:02 RVC-Boss

GPT training now runs without problems. Thank you very much.

c469591 avatar Feb 26 '24 13:02 c469591
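
The workaround in this thread edits a file inside site-packages, so it is lost whenever the bundled runtime is replaced. The traceback shows the hostname reaching TCPStore through `_env_rendezvous_handler`, which reads the `MASTER_ADDR` environment variable. A minimal sketch of an alternative is to set that variable before training starts. The helper name `force_local_master_addr` is hypothetical, and it is an assumption (not confirmed anywhere in this thread, and not necessarily what the linked commit does) that this avoids the error on every affected machine.

```python
import os
from typing import MutableMapping, Optional


def force_local_master_addr(
    env: MutableMapping[str, str] = os.environ,
    addr: str = "localhost",
) -> Optional[str]:
    """Point MASTER_ADDR at a plain local hostname.

    Hypothetical helper: returns the previous value (or None if unset)
    so the caller can log or restore it.
    """
    previous = env.get("MASTER_ADDR")
    env["MASTER_ADDR"] = addr
    return previous


if __name__ == "__main__":
    # Call this before trainer.fit(...) so the env:// rendezvous sees it.
    old = force_local_master_addr()
    print(f"MASTER_ADDR was {old!r}, now {os.environ['MASTER_ADDR']!r}")
```

If this has no effect on a given setup, the in-place edit to rendezvous.py described above remains the workaround reported to work in this thread.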