Retrieval-based-Voice-Conversion-WebUI

macOS Apple M1: cannot train, the training process won't complete

Open eisneim opened this issue 1 year ago • 4 comments

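All dependencies are installed. The first two steps, preprocessing and feature extraction (which produces the 3_feature256 folder), work without problems. But when the actual training starts, it finishes instantly, as if it were skipped entirely. The console shows "batch_size for every GPU: 1". I have tried all kinds of parameter combinations and none of them help: as soon as I click "train model", it reports that training is complete.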

step 1: processing data

... (omitted here)

step 3a: model traning started
write filelist done
python train_nsf_sim_cache_sid_load_pretrain.py -e eisneim -sr 40k -f0 1 -bs 1 -g 0 -te 100 -se 5 -pg pretrained/f0G40k.pth -pd pretrained/f0D40k.pth -l 0 -c 0 -sw 1 -v v1
Training complete. Logs are available in the console, or the 'train.log' under experiment folder
(50590, 256),1297
training index
adding index
Index built successfully, added_IVF1297_Flat_nprobe_1_eisneim_v1.index
all processes have been completed!

There are no ckpt or pth files in the log folder. Is training completely unsupported on the Mac ARM platform?
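One possible explanation for the instant "Training complete", offered here as an assumption rather than something confirmed from the script: if the trainer starts one worker per detected device and the detected count is 0 (no CUDA device on an M1), the launch loop has nothing to iterate over and returns immediately. A minimal sketch of that behavior, where `launch_workers` and `worker` are hypothetical names and not taken from the repository:

```python
from typing import Callable, List


def launch_workers(n_devices: int, worker: Callable[[int], None]) -> List[int]:
    """Run one worker per device and return the ranks that were started.

    Hypothetical stand-in for a per-device training launcher; the real
    script may be structured differently.
    """
    started: List[int] = []
    for rank in range(n_devices):  # range(0) is empty, so the body never runs
        worker(rank)
        started.append(rank)
    return started


def fake_train(rank: int) -> None:
    """Placeholder for the real training function (assumption)."""
    print(f"training on device {rank}")
```

With `n_devices=0` the function returns an empty list without ever calling `fake_train`, which would match a run that ends right away and leaves no ckpt or pth files behind.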

eisneim avatar Jun 02 '23 14:06 eisneim

I'm working on this now in #313.

Tps-F avatar Jun 02 '23 14:06 Tps-F

Here is my quick workaround, and it works: change just one line in train_nsf_sim_cache_sid_load_pretrain.py

def main():
    n_gpus = torch.cuda.device_count()

change to this:

def main():
    n_gpus = 1 if torch.backends.mps.is_available() else torch.cuda.device_count()

With that change training runs. Here is some of the output:

...
INFO:eisneim:loss_disc=2.876, loss_gen=2.512, loss_fm=8.741,loss_mel=18.739, loss_kl=1.106
INFO:eisneim:====> Epoch: 67 [2023-06-03 06:55:52] | (0:10:21.757121)
INFO:eisneim:====> Epoch: 68 [2023-06-03 07:06:17] | (0:10:24.585755)
INFO:eisneim:====> Epoch: 69 [2023-06-03 07:16:43] | (0:10:25.606220)
INFO:eisneim:Train Epoch: 70 [84%]
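For anyone who wants the same idea in a slightly more defensive form, here is a rough sketch. It assumes a PyTorch build that exposes `torch.backends.mps` (1.12 or later); the helper names `count_training_devices` and `detect_training_devices` are made up for illustration and are not part of the repository. Unlike the one-line change above, this variant prefers CUDA when it is present and only falls back to a single MPS device otherwise.

```python
def count_training_devices(cuda_count: int, mps_available: bool) -> int:
    """Decide how many training workers to start.

    Prefers CUDA devices; falls back to one MPS device on Apple Silicon;
    returns 0 if neither is available.
    """
    if cuda_count > 0:
        return cuda_count
    return 1 if mps_available else 0


def detect_training_devices() -> int:
    """Query PyTorch for available devices (returns 0 if torch is missing)."""
    try:
        import torch
    except ImportError:
        return 0
    # Older PyTorch builds have no torch.backends.mps attribute at all.
    mps = getattr(torch.backends, "mps", None)
    mps_available = bool(mps is not None and mps.is_available())
    return count_training_devices(torch.cuda.device_count(), mps_available)
```

Inside `main()`, `n_gpus = detect_training_devices()` would then replace the original assignment. Whether the rest of the script behaves correctly on MPS is a separate question that this sketch does not address.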

eisneim avatar Jun 03 '23 00:06 eisneim

Sure, I know, but that is only a temporary fix. It may cause an infinite loop in some environments. I'm trying to figure out what causes that...

Tps-F avatar Jun 03 '23 01:06 Tps-F

I think I might have figured out why.

Tps-F avatar Jun 03 '23 02:06 Tps-F

Fixed in bf1170012564463e9b9f13e7174b1eb499bdcbd2

Tps-F avatar Jun 05 '23 11:06 Tps-F