
Input/output channel mismatch in a convolution during training

Open wangxuanji opened this issue 3 months ago • 0 comments

I run into an error when training reaches the validation step, which happens at 1000 steps:

Traceback (most recent call last):
  File "train.py", line 321, in <module>
    main(args, configs)
  File "train.py", line 196, in main
    figs, wav_reconstruction, wav_prediction, tag = synth_one_sample(
  File "/home/wxk/diff/DiffGAN-TTS-main/utils/tools.py", line 227, in synth_one_sample
    mels = [mel_pred[0, :mel_len].float().detach().transpose(0, 1) for mel_pred in diffusion.sampling(cond=cond)]
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/wxk/diff/DiffGAN-TTS-main/model/diffusion.py", line 162, in sampling
    x = self.p_sample(xs[-1], torch.full((b,), i, device=device, dtype=torch.long), cond, spk_emb)
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/wxk/diff/DiffGAN-TTS-main/model/diffusion.py", line 124, in p_sample
    x_0_pred = self.denoise_fn(x_t, t, cond, spk_emb)
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wxk/diff/DiffGAN-TTS-main/model/modules.py", line 618, in forward
    x, skip_connection = layer(x, conditioner, diffusion_step, speaker_emb)
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wxk/diff/DiffGAN-TTS-main/model/blocks.py", line 670, in forward
    conditioner = self.conditioner_projection(conditioner)
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wxk/diff/DiffGAN-TTS-main/model/blocks.py", line 191, in forward
    conv_signal = self.conv(signal)
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 307, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/wxk/anaconda3/envs/diffgan/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 303, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [256, 256, 1], expected input[32, 839, 256] to have 256 channels, but got 839 channels instead
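Reading the error message: `torch.nn.Conv1d` expects input laid out as (batch, channels, time), and a weight of size [256, 256, 1] means 256 input channels. The tensor it received, [32, 839, 256], looks like (batch, time, features), so dimension 1 is read as 839 channels. The sketch below reproduces this in isolation. The names `conv` and `cond` are hypothetical stand-ins, not code from the repository, and where (or whether) a transpose belongs in the actual validation path is an assumption that would need checking against the training path.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the projection layer in the traceback:
# a weight of size [256, 256, 1] corresponds to in_channels=256,
# out_channels=256, kernel_size=1.
conv = nn.Conv1d(in_channels=256, out_channels=256, kernel_size=1)

# Same shape as reported in the error: (batch=32, time=839, features=256).
cond = torch.randn(32, 839, 256)

try:
    conv(cond)  # Conv1d reads dim 1 as channels, so it sees 839 channels
    raised = False
except RuntimeError:
    raised = True

# Moving the feature axis to dim 1 gives the (batch, channels, time)
# layout that Conv1d expects.
out = conv(cond.transpose(1, 2))
out_shape = tuple(out.shape)
```

If the same layer works during training, comparing the shape of the conditioning tensor in the training call against the one passed by `synth_one_sample` may show where the layouts diverge.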

How should I solve this?

wangxuanji, Mar 13 '24 07:03