
train.log is empty after training

Open • bendchen opened this issue 1 year ago • 1 comment

I trained a voice and found that the training log is empty afterwards. Could someone help me figure out what the problem is? Thanks!

These are the files in the model's log directory:

```
2-name2text.txt      5-wav32k              eval                                            train.log
3-bert               6-name2semantic.tsv   events.out.tfevents.1727580226.t630.1145688.0   logs_s1
4-cnhubert           config.json           events.out.tfevents.1727580386.t630.1175751.0   logs_s2
                                           events.out.tfevents.1727580550.t630.1180145.0
```

And this is train.log:

```
$ ll train.log
-rw-r--r--. 1 abc abc 0 Sep 29 11:23 train.log
```

bendchen • Sep 29 '24

Please provide the relevant terminal output so the problem can be located.

SapphireLab • Sep 30 '24

Sorry, I forgot to paste the corresponding log output. Here it is:

```
"/home/usera/miniconda3/envs/GPTSoVits/bin/python" GPT_SoVITS/s2_train.py --config "/home/usera/gptsovits/GPT-SoVITS-20240821v2/TEMP/tmp_s2.json"
INFO:yidao:{'train': {'log_interval': 100, 'eval_interval': 500, 'seed': 1234, 'epochs': 8, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 6, 'fp16_run': True, 'lr_decay': 0.999875, 'segment_size': 20480, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'text_low_lr_rate': 0.4, 'pretrained_s2G': 'GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2G2333k.pth', 'pretrained_s2D': 'GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2D2333k.pth', 'if_save_latest': True, 'if_save_every_weights': True, 'save_every_epoch': 4, 'gpu_numbers': '0-1'}, 'data': {'max_wav_value': 32768.0, 'sampling_rate': 32000, 'filter_length': 2048, 'hop_length': 640, 'win_length': 2048, 'n_mel_channels': 128, 'mel_fmin': 0.0, 'mel_fmax': None, 'add_blank': True, 'n_speakers': 300, 'cleaned_text': True, 'exp_dir': 'logs/yidao'}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [10, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 8, 2, 2], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 512, 'semantic_frame_rate': '25hz', 'freeze_quantizer': True, 'version': 'v2'}, 's2_ckpt_dir': 'logs/yidao', 'content_module': 'cnhubert', 'save_weight_dir': 'SoVITS_weights_v2', 'name': 'yidao', 'version': 'v2', 'pretrain': None, 'resume_step': None}
phoneme_data_len: 3
wav_data_len: 99
100%|████████████| 99/99 [00:00<00:00, 82078.69it/s]
skipped_phone: 0 , skipped_dur: 0
total left: 99
phoneme_data_len: 3
wav_data_len: 99
100%|████████████| 99/99 [00:00<00:00, 80895.40it/s]
skipped_phone: 0 , skipped_dur: 0
total left: 99
INFO:yidao:loaded pretrained GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2G2333k.pth
<All keys matched successfully>
<All keys matched successfully>
INFO:yidao:loaded pretrained GPT_SoVITS/pretrained_models/gsv-v2final-pretrained/s2D2333k.pth
<All keys matched successfully>
[00:00<?, ?it/s]<All keys matched successfully>
[00:00<?, ?it/s][W reducer.cpp:1346] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
[W reducer.cpp:1346] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration, which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
INFO:yidao:Train Epoch: 1 [0%]
INFO:yidao:[2.7050821781158447, 1.8196076154708862, 10.560304641723633, 17.74567222595215, 0.0, 2.593526840209961, 0, 9.99875e-05]
INFO:yidao:Saving model and optimizer state at iteration 4 to logs/yidao/logs_s2/G_233333333333.pth
INFO:yidao:Saving model and optimizer state at iteration 4 to logs/yidao/logs_s2/D_233333333333.pth
INFO:yidao:saving ckpt yidao_e4:Success.
INFO:yidao:====> Epoch: 4
INFO:yidao:====> Epoch: 5
INFO:yidao:====> Epoch: 6
INFO:yidao:====> Epoch: 7
INFO:yidao:Saving model and optimizer state at iteration 8 to logs/yidao/logs_s2/G_233333333333.pth
INFO:yidao:Saving model and optimizer state at iteration 8 to logs/yidao/logs_s2/D_233333333333.pth
INFO:yidao:saving ckpt yidao_e8:Success.
INFO:yidao:====> Epoch: 8
"/home/usera/miniconda3/envs/GPTSoVits/bin/python" GPT_SoVITS/s1_train.py --config_file "/home/usera/gptsovits/GPT-SoVITS-20240821v2/TEMP/tmp_s1.yaml"
Seed set to 1234
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
<All keys matched successfully>
ckpt_path: None
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/2
[rank: 1] Seed set to 1234
<All keys matched successfully>
ckpt_path: None
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/2
distributed_backend=nccl
All distributed processes registered. Starting with 2 processes
semantic_data_len: 3
phoneme_data_len: 3
semantic_data_len: 3
phoneme_data_len: 3
                       item_name                                     semantic_audio
0  vo_YMLQ002_7_kazumichi_01.wav  1012 298 93 12 857 567 316 851 497 869 692 513...
1  vo_YMLQ002_7_kazumichi_02.wav  582 239 239 247 997 571 966 894 408 112 769 47...
2  vo_YMLQ002_7_kazumichi_03.wav  1012 403 868 414 266 740 980 733 995 570 162 9...
dataset.__len__(): 99
                       item_name                                     semantic_audio
0  vo_YMLQ002_7_kazumichi_01.wav  1012 298 93 12 857 567 316 851 497 869 692 513...
1  vo_YMLQ002_7_kazumichi_02.wav  582 239 239 247 997 571 966 894 408 112 769 47...
2  vo_YMLQ002_7_kazumichi_03.wav  1012 403 868 414 266 740 980 733 995 570 162 9...
dataset.__len__(): 99
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name  | Type                 | Params | Mode
------------------------------------------------------
0 | model | Text2SemanticDecoder | 77.6 M | train
------------------------------------------------------
77.6 M    Trainable params
0         Non-trainable params
77.6 M    Total params
310.426   Total estimated model params size (MB)
257       Modules in train mode
0         Modules in eval mode
/home/usera/miniconda3/envs/GPTSoVits/lib/python3.9/site-packages/pytorch_lightning/loops/fit_loop.py:298: The number of training batches (9) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
Epoch 14: 100%|█| 9/9 [00:01<00:00, 7.80it/s, v_num=0, total_loss_step=130.0, lr_step=0.002, top_3_acc_step=1.000, total_loss_epoch=234.0, lr_ep
`Trainer.fit` stopped: `max_epochs=15` reached.
Epoch 14: 100%|█| 9/9 [00:04<00:00, 1.83it/s, v_num=0, total_loss_step=130.0, lr_step=0.002, top_3_acc_step=1.000, total_loss_epoch=234.0, lr_ep
```

From the log, the line that looks relevant is `fit_loop.py:298: The number of training batches (9) is smaller than the logging interval Trainer(log_every_n_steps=50)`, but I don't know how to adjust the program accordingly.

bendchen • Oct 03 '24

`logs/{experiment_name}/train.log` is the log written by the SoVITS training stage, `s2_train.py`. In PR #1422 the log level set by the `get_logger()` function in `GPT_SoVITS\utils.py` was raised to ERROR, so ordinary INFO-level messages are no longer saved. The level order is: DEBUG < INFO < WARNING < ERROR < CRITICAL.
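For illustration, here is a minimal standalone demonstration of why an ERROR-level logger silently drops INFO messages (plain `logging` module usage, not GPT-SoVITS code):

```python
import logging

logger = logging.getLogger("demo")
logger.addHandler(logging.StreamHandler())

logger.setLevel(logging.ERROR)
logger.info("dropped: INFO is below the ERROR threshold")   # nothing is emitted
logger.error("kept: at or above the ERROR threshold")

logger.setLevel(logging.DEBUG)
logger.info("kept: DEBUG lets every level through")          # now emitted
```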

So if you need these logs:

  1. Change `logger.setLevel` back to the lowest level, `logging.DEBUG`, i.e. revert the change shown in the image below from right to left (see the sketch after the image).
  2. To change the logging frequency, adjust `train.log_interval` in `GPT_SoVITS\configs\s2.json`.

[Image: diff of the `logger.setLevel` change in `GPT_SoVITS/utils.py`]
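In code form, step 1 amounts to something like the following. This is a sketch assuming `get_logger()` still follows the upstream VITS-style layout; the exact body in `GPT_SoVITS/utils.py` may differ:

```python
import logging
import os

def get_logger(model_dir, filename="train.log"):
    # Sketch of GPT_SoVITS/utils.py::get_logger with the PR #1422 change reverted.
    logger = logging.getLogger(os.path.basename(model_dir))
    logger.setLevel(logging.DEBUG)  # was logging.ERROR; DEBUG lets INFO lines through

    formatter = logging.Formatter("%(asctime)s\t%(name)s\t%(levelname)s\t%(message)s")
    os.makedirs(model_dir, exist_ok=True)
    handler = logging.FileHandler(os.path.join(model_dir, filename))
    handler.setLevel(logging.DEBUG)  # the handler level must be lowered too, or it filters again
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    return logger
```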

That said, after training it is better to run `tensorboard --logdir="logs/{experiment_name}"` to inspect the training records.
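If you prefer to read the `events.out.tfevents.*` files programmatically instead of through the TensorBoard UI, the `tensorboard` package can load them directly. A sketch; the scalar tag `loss/g/total` is a guess based on VITS-style training scripts, so check `acc.Tags()` for the tags actually logged:

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point the accumulator at the experiment directory holding the tfevents files.
acc = EventAccumulator("logs/yidao")
acc.Reload()

# List the scalar tags that were actually written before guessing names.
print(acc.Tags()["scalars"])

# Each ScalarEvent carries (wall_time, step, value).
for event in acc.Scalars("loss/g/total"):
    print(event.step, event.value)
```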

SapphireLab • Oct 03 '24

OK, thank you!

bendchen • Oct 04 '24