SRD-VC: Question about demo.py

Hello, when running demo.py I found the line from autovc.synthesis import build_model. I copied synthesis.py from autovc into my directory, but I still get an error:

Traceback (most recent call last):
  File "demo.py", line 140, in <module>
    model = build_model().to(device)
  File "/data2/panl/SRD-VC-master/My_model/synthesis.py", line 22, in build_model
    out_channels=hparams.out_channels,
AttributeError: 'HParams' object has no attribute 'out_channels'

Does hparams.py also need to come from autovc?

Oct 06 '22 08:10 3030xx-stack

That is a very good question. Yes, hparams.py also has to come from autovc, because the parameters in My_model are imported through https://github.com/YoungSeng/SRD-VC/blob/d225c47455b5c67e94daeb91b8b98781c43932ed/My_model/demo.py#L5

For convenience, I have also uploaded my autovc folder, in which I modified one line of code: https://github.com/YoungSeng/SRD-VC/blob/d225c47455b5c67e94daeb91b8b98781c43932ed/autovc/synthesis.py#L11
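The AttributeError in the question suggests that the hparams object being imported lacks a field that synthesis.py expects. A minimal sketch for checking this before building the model is shown here. The helper name missing_hparams and the stand-in objects are hypothetical; only the attribute name out_channels comes from the traceback.

```python
from types import SimpleNamespace


def missing_hparams(hparams, required):
    """Return the names in `required` that `hparams` does not define."""
    return [name for name in required if not hasattr(hparams, name)]


# Stand-ins for two different hparams objects (placeholder values only).
autovc_like = SimpleNamespace(out_channels=30, layers=24)
other = SimpleNamespace(layers=24)

print(missing_hparams(autovc_like, ["out_channels"]))  # []
print(missing_hparams(other, ["out_channels"]))        # ['out_channels']
```

In a real run, printing the __file__ attribute of the imported hparams module shows which hparams.py Python actually loaded.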

Oct 06 '22 17:10 YoungSeng

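Hello, my coding background is fairly weak, so I have a few questions. First, is data_split.py required? Second, when I run this program I do not get all of the files you have under VCTK. Third, how were these two directories obtained: root_dir='/ceph/home/yangsc21/Python/VCTK/wav16/spmel_100_crop_cat/', feat_dir='/ceph/home/yangsc21/Python/VCTK/wav16/raptf0_100_crop_cat/'? I hope to hear from you, thanks!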

Oct 13 '22 13:10 3030xx-stack

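It is not required; we simply need to split the data into training, validation, and test sets in most cases. In addition, I picked audio with the same content and the same length for the test set so that MCD is easy to compute (if the lengths differ, DTW can be used afterwards), and I only kept audio whose length does not exceed 128*3. If your background is limited, I suggest first writing the code with a small amount of audio from just two speakers, which makes debugging easier.
data_split.py is rather messy, so feel free to rewrite it. It is only a reference, and some of the commented-out code can be tried by uncommenting it.
Those two folders hold all of the mel-spectrograms and pitch contours concatenated together, just as in SpeechSplit. I have just uploaded the wav_cat.py file from my project, which can serve as a reference.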
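For readers who want to rewrite the split from scratch, here is a minimal sketch of the idea, not the logic of data_split.py itself. It assumes the length limit 128*3 is counted in mel frames and that a dict mapping utterance names to frame counts is already available; the function name, ratios, and seed are all hypothetical.

```python
import random

MAX_FRAMES = 128 * 3  # length limit from the reply; the unit (frames) is assumed


def split_utterances(frame_counts, val_ratio=0.1, test_ratio=0.1, seed=0):
    """Drop over-long utterances, then split the rest into train/val/test."""
    kept = sorted(name for name, n in frame_counts.items() if n <= MAX_FRAMES)
    random.Random(seed).shuffle(kept)  # fixed seed keeps the split reproducible
    n_test = int(len(kept) * test_ratio)
    n_val = int(len(kept) * val_ratio)
    return {
        "test": kept[:n_test],
        "val": kept[n_test:n_test + n_val],
        "train": kept[n_test + n_val:],
    }


# Toy input: twenty short utterances and one that exceeds the limit.
frame_counts = {f"utt_{i:02d}": 100 for i in range(20)}
frame_counts["long"] = 500

splits = split_utterances(frame_counts)
print({name: len(items) for name, items in splits.items()})
```

Starting with two speakers, as suggested above, only changes the contents of frame_counts.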

Oct 13 '22 14:10 YoungSeng

Hello, I ran into this problem during training. Could you give me some pointers?

Traceback (most recent call last):
  File "main.py", line 81, in <module>
    main(config)
  File "main.py", line 32, in main
    solver.train()
  File "/data/SRD-VC-master/My_model/solver.py", line 300, in train
    x_identic, mel_outputs_postnet, spk_pred, content_pred, pitch_predict = self.mi_second_forward(content, pitch, rhythm,
  File "/data/SRD-VC-master/My_model/solver.py", line 177, in mi_second_forward
    x_identic, mel_outputs_postnet, spk_pred, content_pred, pitch_predict = self.G2(
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/SRD-VC-master/My_model/model.py", line 428, in forward
    encoder_outputs = torch.cat((code_exp_1, code_exp_2, code_exp_3,
RuntimeError: Sizes of tensors must match except in dimension 1. Got 384 and 192 (The offending index is 0)

Oct 19 '22 12:10 3030xx-stack

Use pdb to look at the shapes of code_exp_1, code_exp_2, and code_exp_3. Some dimensions are 384 and some are 192, so the tensors cannot be concatenated.

Oct 19 '22 12:10 YoungSeng

(Pdb) print(code_exp_1.shape)
torch.Size([16, 192, 16])
(Pdb) print(code_exp_2.shape)
torch.Size([16, 192, 2])
(Pdb) print(code_exp_3.shape)
torch.Size([16, 192, 64])
(Pdb) print(code_exp_4.shape)
torch.Size([16, 256])
(Pdb) print(code_exp_4.unsqueeze(1).expand(-1, 128*3, -1).shape)
torch.Size([16, 384, 256])

This is what my breakpoint printed.

Oct 19 '22 13:10 3030xx-stack

Then it looks like the second dimension has to match for cat to work. You could try: code_exp_4.unsqueeze(1).expand(-1, 192, -1)

Or try changing MAX_LEN=192 in the hyperparameters to 384.
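The shape rule behind this error can be checked without running the model. The sketch below is plain Python rather than PyTorch, and it assumes the concatenation in model.py runs along the last dimension, which the thread does not show. The helper cat_shape is hypothetical; the shapes are the ones printed in the pdb session.

```python
def cat_shape(shapes, dim):
    """Shape obtained by concatenating tensors with `shapes` along `dim`.

    Raises ValueError when any other dimension differs, which is the
    condition behind "Sizes of tensors must match except in dimension ...".
    """
    ndim = len(shapes[0])
    dim = dim % ndim  # allow negative dims such as -1
    out = list(shapes[0])
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != ndim:
            raise ValueError(f"tensor {idx} has {len(shape)} dims, expected {ndim}")
        for d in range(ndim):
            if d != dim and shape[d] != out[d]:
                raise ValueError(
                    f"dimension {d} differs: {out[d]} vs {shape[d]} (tensor {idx})"
                )
        out[dim] += shape[dim]
    return tuple(out)


# Shapes of code_exp_1, code_exp_2, code_exp_3 from the pdb output.
code_shapes = [(16, 192, 16), (16, 192, 2), (16, 192, 64)]

# code_exp_4 is (16, 256); unsqueeze(1).expand(-1, T, -1) gives (16, T, 256).
expanded_384 = (16, 384, 256)
expanded_192 = (16, 192, 256)

print(cat_shape(code_shapes + [expanded_192], dim=-1))  # (16, 192, 338)
try:
    cat_shape(code_shapes + [expanded_384], dim=-1)
except ValueError as err:
    print("mismatch:", err)
```

Either suggestion in the reply removes the mismatch: expanding code_exp_4 to 192 steps, or making the other three codes 384 steps long.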

Oct 19 '22 14:10 YoungSeng

I tried that and it still does not work. There is no MAX_LEN in the hyperparameters; do you mean max_len_pad = 384?

Oct 19 '22 14:10 3030xx-stack

It keeps failing, now with a different error:

Traceback (most recent call last):
  File "main.py", line 81, in <module>
    main(config)
  File "main.py", line 32, in main
    solver.train()
  File "/data/SRD-VC-master/My_model/solver.py", line 246, in train
    x_real_org, emb_org, f0_org, len_org = next(data_iter)
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/data/SRD-VC-master/My_model/data_loader.py", line 112, in __call__
    left = np.random.randint(0, len(aa) - len_crop[0], size=2)
  File "mtrand.pyx", line 747, in numpy.random.mtrand.RandomState.randint
  File "_bounded_integers.pyx", line 1254, in numpy.random._bounded_integers._rand_int64
ValueError: low >= high

Oct 19 '22 14:10 3030xx-stack

I tried code_exp_4.unsqueeze(1).expand(-1, 192, -1), and then changed the failing line above to left = np.random.randint(0, abs(len(aa) - len_crop[0]), size=2). Training now runs, but I am not sure whether this is acceptable.

Oct 19 '22 14:10 3030xx-stack

It is, but I recommend checking how len(aa) compares with len_crop[0] and fixing the cause of the low >= high error.

Written this way, some of the data may never be reached by the random index.
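One way to act on this advice is to handle the short-sequence case explicitly instead of wrapping the difference in abs(). With abs(), a sequence shorter than the crop yields offsets drawn from an unrelated range, and equal lengths still give a range of zero. The sketch below is a hypothetical helper, not the project's data_loader.py; it uses the standard random module in place of np.random.randint and leaves padding to the caller.

```python
import random


def sample_crop_starts(seq_len, crop_len, size=2, rng=None):
    """Pick `size` start offsets such that start + crop_len <= seq_len.

    If the sequence is not longer than the crop, the only usable start
    is 0 and the caller has to pad (or skip) that sequence.
    """
    rng = rng or random.Random()
    max_start = max(seq_len - crop_len, 0)
    # randrange excludes its upper bound, so +1 makes max_start reachable.
    return [rng.randrange(0, max_start + 1) for _ in range(size)]


print(sample_crop_starts(200, 128, rng=random.Random(0)))  # two offsets in 0..72
print(sample_crop_starts(100, 128))  # [0, 0]
```

Logging how often the short-sequence branch is hit would also show whether the crop length or the data lengths are the real cause.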

Oct 19 '22 14:10 YoungSeng

Hello, is it not possible to change the GPU id used in demo.py and inferen.py?

Traceback (most recent call last):
  File "demo.py", line 108, in <module>
    x_identic_val = Generator_F(
  File "demo.py", line 83, in Generator_F
    _, mel, _, _, _ = G2(content, pitch, rhythm, mel_2, MAX_LEN)
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/lili/anaconda3/envs/srd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() takes 5 positional arguments but 6 were given

How should I solve this kind of problem?

Oct 25 '22 08:10 3030xx-stack

Your problem does not seem to be about the GPU id. The error says that G2 takes five input arguments (content, pitch, rhythm, mel_2, MAX_LEN) but you passed six.
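When reading this kind of message, it helps to remember that Python counts self among the positional arguments. So "takes 5 ... but 6 were given" may mean the forward in use accepts four inputs and received five. The sketch below reproduces the wording with a hypothetical class TinyModel; it is not the project's G2, whose signature the thread does not show.

```python
class TinyModel:
    # `self` plus four inputs: five positional parameters in total.
    def forward(self, content, pitch, rhythm, mel):
        return content, pitch, rhythm, mel


model = TinyModel()
message = ""
try:
    # Five inputs are passed; together with `self` that makes six.
    model.forward("content", "pitch", "rhythm", "mel", 192)
except TypeError as err:
    message = str(err)

print(message)  # ends with: takes 5 positional arguments but 6 were given
```

Comparing the call site with the forward definition of the model class actually loaded is therefore a reasonable first check.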

Oct 25 '22 12:10 YoungSeng

Thanks for the reply, it is solved now.

Oct 25 '22 12:10 3030xx-stack

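Hello, regarding valid_path = "/ceph/home/yangsc21/Python/autovc/SpeechSplit/assets/test_mel/test.pkl": did you use two utterances for testing, or all utterances from the three test speakers? I initially concatenated all utterances of each of the three speakers, but that does not seem to work.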

Oct 26 '22 00:10 3030xx-stack

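No. I checked, and I believe I used the p225_001 audio and the p232_001 data for debug testing. I do not think I had uploaded the corresponding SpeechSplit data, because SpeechSplit can only do VC on in-set data and cannot do one-shot VC. I have now uploaded my SpeechSplit folder anyway, including test.pkl. Hope this helps.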

Oct 26 '22 01:10 YoungSeng

Hello, how was test.pkl in your SpeechSplit folder generated? I looked at it, and its content differs from what I generate with My_model/make_test_metadata.py.

I have uploaded the code that produces this file: https://github.com/YoungSeng/SRD-VC/blob/master/SpeechSplit/make_test_metadata.py. Hope this helps.
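When two metadata files differ, printing a structural outline of each can show where they diverge. The sketch below assumes the pickles hold nested lists or tuples, possibly containing arrays with a shape attribute. The helpers summarize and summarize_pickle and the toy object are hypothetical; the real layout of test.pkl is not described in this thread.

```python
import pickle


def summarize(obj, depth=0, max_depth=3):
    """Return an indented outline (one line per item) of a nested object."""
    pad = "  " * depth
    shape = getattr(obj, "shape", None)
    if shape is not None:  # e.g. NumPy arrays
        return f"{pad}{type(obj).__name__} shape={tuple(shape)}"
    if isinstance(obj, (list, tuple)):
        head = f"{pad}{type(obj).__name__} len={len(obj)}"
        if depth >= max_depth:
            return head
        items = [summarize(x, depth + 1, max_depth) for x in obj]
        return "\n".join([head] + items)
    if isinstance(obj, str):
        return f"{pad}str {obj!r}"
    return f"{pad}{type(obj).__name__}"


def summarize_pickle(path, max_depth=3):
    # Only unpickle files you trust: loading a pickle can execute code.
    with open(path, "rb") as f:
        return summarize(pickle.load(f), max_depth=max_depth)


# Toy object standing in for a metadata file (made-up structure).
toy = [["p225", [0.0, 1.0], ("utt_001", 96)]]
print(summarize(toy))
```

Running summarize_pickle on both files and diffing the two outlines narrows the comparison to the entries that actually differ.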

Nov 01 '22 03:11 YoungSeng

I have replied there, please take a look:

https://github.com/YoungSeng/SRD-VC/issues/8

Nov 01 '22 03:11 YoungSeng

If I need to train on Chinese, does the mel-spectrogram extraction need to change?

Nov 01 '22 03:11 3030xx-stack

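Yes. The mel extractor and the vocoder form a matched pair. The current ones were trained on an English dataset, so for Chinese you would need to retrain them yourself or look for a vocoder and mel extractor pretrained on Chinese.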

Nov 01 '22 04:11 YoungSeng

Mel extractor? Do you mean training a vocoder directly on Chinese?

Nov 01 '22 04:11 3030xx-stack

The mel extractor. I strongly recommend starting with AutoVC. The mel extraction in this project is the same as in SpeechSplit: https://github.com/auspicious3000/SpeechSplit/blob/10fd57e8fe2570010bbf6dd18ec210c41efe7ddd/make_spect_f0.py#L57

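Yes, you can retrain, but I recommend using a model that someone else has already pretrained and swapping it in for the mel extractor and vocoder in these projects. You can look for pretrained models here: https://huggingface.co/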

Hope this helps.
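Because the mel extractor and the vocoder have to agree, it can be worth comparing their settings before swapping in a pretrained model. The sketch below only illustrates that check. The helper config_mismatches, the key names, and every value are hypothetical and are not taken from this project or from any particular vocoder.

```python
def config_mismatches(extractor_cfg, vocoder_cfg, keys):
    """Return {key: (extractor_value, vocoder_value)} for settings that differ."""
    diffs = {}
    for key in keys:
        a = extractor_cfg.get(key)
        b = vocoder_cfg.get(key)
        if a != b:
            diffs[key] = (a, b)
    return diffs


# Made-up settings, for illustration only.
extractor_cfg = {"sample_rate": 16000, "n_mels": 80, "hop_length": 256}
vocoder_cfg = {"sample_rate": 22050, "n_mels": 80, "hop_length": 256}

KEYS = ["sample_rate", "n_mels", "hop_length"]
print(config_mismatches(extractor_cfg, vocoder_cfg, KEYS))
# {'sample_rate': (16000, 22050)}
```

An empty result does not guarantee compatibility, since normalization and frequency range also matter, but a non-empty one is a clear sign that the two do not belong together.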

Nov 01 '22 11:11 YoungSeng
