Error when training GPT with the GPT-SoVITS integrated package: RuntimeError: unmatched '}' in format string

D:\GPT-SoVITS>runtime\python.exe webui.py
Running on local URL:  http://0.0.0.0:9874
"D:\GPT-SoVITS\runtime\python.exe" GPT_SoVITS/s1_train.py --config_file "TEMP/tmp_s1.yaml"
Seed set to 1234
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
<All keys matched successfully>
ckpt_path: None
[rank: 0] Seed set to 1234
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
Traceback (most recent call last):
  File "D:\GPT-SoVITS\GPT_SoVITS\s1_train.py", line 171, in <module>
    main(args)
  File "D:\GPT-SoVITS\GPT_SoVITS\s1_train.py", line 147, in main
    trainer.fit(model, data_module, ckpt_path=ckpt_path)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "D:\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\call.py", line 43, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\strategies\launchers\subprocess_script.py", line 102, in launch
    return function(*args, **kwargs)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 947, in _run
    self.strategy.setup_environment()
  File "D:\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 148, in setup_environment
    self.setup_distributed()
  File "D:\GPT-SoVITS\runtime\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 199, in setup_distributed
    _init_dist_connection(self.cluster_environment, self._process_group_backend, timeout=self._timeout)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\lightning_fabric\utilities\distributed.py", line 290, in _init_dist_connection
    torch.distributed.init_process_group(torch_distributed_backend, rank=global_rank, world_size=world_size, **kwargs)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\torch\distributed\distributed_c10d.py", line 888, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 245, in _env_rendezvous_handler
    store = _create_c10d_store(master_addr, master_port, rank, world_size, timeout)
  File "D:\GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 176, in _create_c10d_store
    return TCPStore(
RuntimeError: unmatched '}' in format string

Jan 18 '24 03:01 win10ogod

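Same here. After a long wait it reports a timeout. batch_size is 3 and everything else is left at the defaults. I asked GPT and checked that the port is not in use. How can I fix this?

The file structure is as follows:

The temp_s1.yaml file is as follows: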

Jan 19 '24 07:01 PLL-L

same problem. OS: Win11, CUDA 12.1

Jan 23 '24 18:01 ioritree

I am facing the same issue as well. OS: Win11, torch 2.1.2+cu118, torchaudio 2.0.1+cu118, torchmetrics 1.3.0.post0, torchvision 0.15.1+cu118

If anyone has insights or solutions, I would greatly appreciate the help. Thank you!

Jan 25 '24 15:01 jx06T

Same problem. OS: Win11, CUDA 11.8

I tried all of the following combinations and none of them solved the problem: Python 3.9 and 3.10, PyTorch 2.0.1 and 2.1.2.

Jan 26 '24 09:01 light1943

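I found a temporary workaround. Open GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py, find the line start_daemon = rank == 0 (around line 175), and add a new line hostname = "localhost" directly below it. That is all it takes.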
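The edit described above can be sketched as a small text transformation. This is only an illustration under assumptions: `add_hostname_override` is a hypothetical helper, not part of torch or GPT-SoVITS, and `SAMPLE` is a simplified stand-in for the real function in rendezvous.py, not its actual source. The helper copies the indentation of the `start_daemon = rank == 0` line so the new line stays in the same block.

```python
ANCHOR = "start_daemon = rank == 0"
OVERRIDE = 'hostname = "localhost"'


def add_hostname_override(source: str) -> str:
    """Insert the override right after the anchor line, with the same indentation."""
    if any(line.strip() == OVERRIDE for line in source.splitlines()):
        return source  # already patched, do nothing
    out = []
    for line in source.splitlines(keepends=True):
        if line.strip() == ANCHOR and not line.endswith("\n"):
            line += "\n"  # anchor was the last line and had no newline
        out.append(line)
        if line.strip() == ANCHOR:
            indent = line[: len(line) - len(line.lstrip())]
            newline = "\r\n" if line.endswith("\r\n") else "\n"
            out.append(f"{indent}{OVERRIDE}{newline}")
    return "".join(out)


# Simplified stand-in for the relevant part of rendezvous.py (NOT the real source).
SAMPLE = (
    "def _create_c10d_store(hostname, port, rank, world_size, timeout):\n"
    "    start_daemon = rank == 0\n"
    "    return (hostname, port, world_size, start_daemon, timeout)\n"
)

patched = add_hostname_override(SAMPLE)
print(patched)
```

Doing the same edit by hand is just as good; the point is that the added line must start in the same column as the line above it. Keep a backup copy of rendezvous.py before changing it.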

Jan 30 '24 11:01 light1943

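> I found a temporary workaround. Open GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py, find the line start_daemon = rank == 0 (around line 175), and add a new line hostname = "localhost" directly below it. That is all it takes.

Thanks, it works well.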

Feb 01 '24 05:02 ioritree

Thank you very much! It works!

Feb 01 '24 07:02 DruidTin

> I found a temporary workaround. Open GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py, find the line start_daemon = rank == 0 (around line 175), and add a new line hostname = "localhost" directly below it. That is all it takes.

  File "H:\SDAI\GPT-SoVITS\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 177
    return TCPStore( hostname, port, world_size, start_daemon, timeout, multi_tenant=True)
IndentationError: unexpected indent

I added the line, and now I get a different error.

Feb 03 '24 15:02 fgod999

Is the indentation missing? The added line must start in the same column as the line above it, start_daemon = rank == 0.
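A minimal sketch of why the indentation matters, using a made-up function rather than the real torch source (`make_store`, `GOOD` and `BAD` are hypothetical names for illustration). When the added line starts at column 0, Python treats the function as finished, and the indented `return` that follows is then rejected.

```python
import ast

# Correct: the added line is aligned with `start_daemon = rank == 0`.
GOOD = (
    "def make_store(hostname, rank):\n"
    "    if rank < 0:\n"
    "        raise ValueError(rank)\n"
    "    else:\n"
    "        start_daemon = rank == 0\n"
    "        hostname = 'localhost'\n"
    "        return (hostname, start_daemon)\n"
)

# Wrong: the same line pasted without any leading spaces.
BAD = GOOD.replace("        hostname = 'localhost'\n", "hostname = 'localhost'\n")


def check(source: str) -> str:
    """Parse the source without running it and report indentation problems."""
    try:
        ast.parse(source)
    except IndentationError as err:
        return f"IndentationError: {err.msg} (line {err.lineno})"
    return "ok"


print("aligned:  ", check(GOOD))
print("unaligned:", check(BAD))
```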

Feb 04 '24 07:02 light1943

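(To everyone hitting this problem, @win10ogod @light1943) Do ping 127.0.0.1 and ping localhost give you the same result? It looks like this happens because the address has to be written as localhost and cannot be written as 127.0.0.1?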
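Besides ping, the same comparison can be made from Python with the standard `socket` module. This is a small sketch, and `resolve` is a hypothetical helper name. It only shows what each name resolves to on the local machine; it does not by itself explain the torch error.

```python
import socket
from typing import List


def resolve(host: str) -> List[str]:
    """Return the sorted, de-duplicated addresses that `host` resolves to."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []  # the name could not be resolved at all
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


for name in ("localhost", "127.0.0.1"):
    print(f"{name!r} -> {resolve(name)}")
```

On many systems localhost also resolves to the IPv6 address ::1, so the two lists are not necessarily identical.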

Feb 08 '24 13:02 RVC-Boss

https://github.com/RVC-Boss/GPT-SoVITS/commit/59f35adad85815df27e9c6b33d420f5ebfd8376b In theory this commit fixes the original poster's problem. If it still fails, try the method posted above of adding hostname = "localhost".

Feb 08 '24 13:02 RVC-Boss

"C:\GPT-SoVITS-beta\runtime\python.exe" GPT_SoVITS/s1_train.py --config_file "C:\GPT-SoVITS-beta\TEMP/tmp_s1.yaml" Seed set to 1234 Using 16bit Automatic Mixed Precision (AMP) GPU available: True (cuda), used: True TPU available: False, using: 0 TPU cores IPU available: False, using: 0 IPUs HPU available: False, using: 0 HPUs <All keys matched successfully> ckpt_path: None [rank: 0] Seed set to 1234 Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1 Traceback (most recent call last): File "C:\GPT-SoVITS-beta\GPT_SoVITS\s1_train.py", line 170, in main(args) File "C:\GPT-SoVITS-beta\GPT_SoVITS\s1_train.py", line 146, in main trainer.fit(model, data_module, ckpt_path=ckpt_path) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 544, in fit call._call_and_handle_interrupt( File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\pytorch_lightning\trainer\call.py", line 43, in _call_and_handle_interrupt return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\pytorch_lightning\strategies\launchers\subprocess_script.py", line 102, in launch return function(*args, **kwargs) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 580, in _fit_impl self._run(model, ckpt_path=ckpt_path) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 947, in _run self.strategy.setup_environment() File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 148, in setup_environment self.setup_distributed() File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\pytorch_lightning\strategies\ddp.py", line 199, in setup_distributed _init_dist_connection(self.cluster_environment, self._process_group_backend, timeout=self._timeout) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\lightning_fabric\utilities\distributed.py", line 290, in _init_dist_connection 
torch.distributed.init_process_group(torch_distributed_backend, rank=global_rank, world_size=world_size, **kwargs) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\torch\distributed\distributed_c10d.py", line 888, in init_process_group store, rank, world_size = next(rendezvous_iterator) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 245, in _env_rendezvous_handler store = _create_c10d_store(master_addr, master_port, rank, world_size, timeout) File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 176, in _create_c10d_store return TCPStore( RuntimeError: unmatched '}' in format string

How can I solve this? I have already followed the method posted above, but it still reports the error. Please help.

Feb 11 '24 15:02 mohancheng

Actually, this method did not solve it for me either, but the current latest version fixes it, so go and update.

Feb 12 '24 02:02 fgod999

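> (To everyone hitting this problem, @win10ogod @light1943) Do ping 127.0.0.1 and ping localhost give you the same result? It looks like this happens because the address has to be written as localhost and cannot be written as 127.0.0.1?

Yes, localhost is 127.0.0.1, but here the address can only be written as localhost; writing the IP 127.0.0.1 causes the error. This is an odd torch bug, and it seems to appear only on Windows. Switching to localhost is a fix that someone found elsewhere recently.

The new version 59f35ad has fixed this problem, so there is no longer any need to modify torch's rendezvous.py. Thanks for the help!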
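For readers who would rather not edit files under site-packages, one possible alternative is to set the rendezvous address through the environment before training starts. This is a sketch under stated assumptions: the traceback above goes through `_env_rendezvous_handler`, which takes the address from the `MASTER_ADDR` environment variable, and Lightning's default cluster environment is assumed to reuse `MASTER_ADDR` when it is already set. `force_localhost_master_addr` is a hypothetical helper name, and this thread does not say whether commit 59f35ad takes this exact approach.

```python
import os


def force_localhost_master_addr() -> str:
    """Point the env:// rendezvous at "localhost" and return the previous value."""
    previous = os.environ.get("MASTER_ADDR", "")
    os.environ["MASTER_ADDR"] = "localhost"
    return previous


# Assumption: this runs in the training script before trainer.fit(...) is called.
old = force_localhost_master_addr()
print(f"MASTER_ADDR: {old!r} -> {os.environ['MASTER_ADDR']!r}")
```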

Feb 13 '24 12:02 light1943

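I keep updating to the latest version, but the problem is still there, and the error is exactly the same as the one I posted above. I really don't know what is going wrong.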

Feb 14 '24 16:02 mohancheng

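> I keep updating to the latest version, but the problem is still there, and the error is exactly the same as the one I posted above. I really don't know what is going wrong.

If you had applied my fix, the error would be reported at line 177 rather than line 176, so check whether you edited the wrong file. If you edited the right file and have also updated to a version after 59f35ad but still get the error, you will probably have to look for a solution yourself. After all, nobody has really analyzed this odd torch problem; all we know is that in some environments hostname does not accept an IP.

>   File "C:\GPT-SoVITS-beta\runtime\lib\site-packages\torch\distributed\rendezvous.py", line 176, in _create_c10d_store
>     return TCPStore(
> RuntimeError: unmatched '}' in format string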
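One way to check whether the right file was edited is to ask the interpreter itself where its torch lives. This is a minimal sketch: `find_rendezvous_file` and `has_override` are hypothetical helper names, and the check simply looks for a line consisting of `hostname = "localhost"`.

```python
import importlib.util
from pathlib import Path
from typing import Optional

OVERRIDE = 'hostname = "localhost"'


def find_rendezvous_file() -> Optional[Path]:
    """Locate torch/distributed/rendezvous.py for the running interpreter, if any."""
    spec = importlib.util.find_spec("torch")  # does not import torch
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = Path(list(spec.submodule_search_locations)[0])
    candidate = package_dir / "distributed" / "rendezvous.py"
    return candidate if candidate.is_file() else None


def has_override(path: Path) -> bool:
    """True if some line of the file is exactly the override (ignoring indentation)."""
    text = path.read_text(encoding="utf-8")
    return any(line.strip() == OVERRIDE for line in text.splitlines())


path = find_rendezvous_file()
if path is None:
    print("torch was not found for this interpreter")
else:
    print(f"{path}: override present = {has_override(path)}")
```

Run it with the same runtime\python.exe that launches webui.py, so that it reports the copy of torch that training actually uses.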

Feb 15 '24 01:02 light1943
