
Issue with Input type and weight type should be the same

Open Ca-ressemble-a-du-fake opened this issue 2 years ago • 0 comments

Hi,

I am trying to train YourTTS on my own dataset, so I followed your helpful guide using the latest stable version of Coqui TTS (0.8.0).

After computing the embeddings (on GPU) without issue, I run into this RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same.

I have already trained a VITS model with this dataset, so everything is already set up. I understand that the input tensor resides on the GPU whereas the weight tensor resides on the CPU, but how can I solve this? Should I downgrade to Coqui TTS 0.6.2?

Here is the full traceback:

  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/trainer/trainer.py", line 1533, in fit
    self._fit()
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/trainer/trainer.py", line 1517, in _fit
    self.train_epoch()
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/trainer/trainer.py", line 1282, in train_epoch
    _, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/trainer/trainer.py", line 1135, in train_step
    outputs, loss_dict_new, step_time = self._optimize(
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/trainer/trainer.py", line 996, in _optimize
    outputs, loss_dict = self._model_train_step(batch, model, criterion, optimizer_idx=optimizer_idx)
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/trainer/trainer.py", line 954, in _model_train_step
    return model.train_step(*input_args)
  File "/home/caraduf/YourTTS/TTS/TTS/tts/models/vits.py", line 1250, in train_step
    outputs = self.forward(
  File "/home/caraduf/YourTTS/TTS/TTS/tts/models/vits.py", line 1049, in forward
    pred_embs = self.speaker_manager.encoder.forward(wavs_batch, l2_norm=True)
  File "/home/caraduf/YourTTS/TTS/TTS/encoder/models/resnet.py", line 169, in forward
    x = self.torch_spec(x)
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/caraduf/YourTTS/yourtts_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/caraduf/YourTTS/TTS/TTS/encoder/models/base_encoder.py", line 22, in forward
    return torch.nn.functional.conv1d(x, self.filter).squeeze(1)
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
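
If I read the last frames correctly, vits.py reaches the speaker encoder through self.speaker_manager.encoder. Below is a minimal sketch of how that kind of access could leave a module on the CPU. It assumes (I have not verified this in the Coqui TTS source) that the manager is a plain Python object rather than an nn.Module. Holder, ToyModel and sync_encoder_device are made-up names for illustration, not Coqui TTS API.

```python
import torch
import torch.nn as nn


class Holder:
    """Plain Python object (hypothetical stand-in for a manager class)."""

    def __init__(self, encoder: nn.Module):
        self.encoder = encoder


class ToyModel(nn.Module):
    """Hypothetical model that reaches its encoder through a non-Module holder."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Conv1d(1, 1, kernel_size=2)
        # Holder is not an nn.Module, so the encoder is not registered as a
        # submodule and ToyModel.to(...) never visits it.
        self.manager = Holder(nn.Conv1d(1, 1, kernel_size=2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x) + self.manager.encoder(x)


def sync_encoder_device(model: ToyModel) -> None:
    """Move the held encoder to the device of the model's own parameters."""
    device = next(model.parameters()).device
    model.manager.encoder.to(device)  # nn.Module.to() works in place


# Falls back to the CPU when no GPU is present, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ToyModel().to(device)  # moves proj only
sync_encoder_device(model)     # moves the held encoder as well
x = torch.randn(4, 1, 16, device=device)
out = model(x)
print(out.shape, out.device)
```

On a GPU machine, skipping the sync_encoder_device call in this toy setup should raise the same kind of input/weight type mismatch as above, since only proj would be on the GPU. Whether this is what actually happens in my training run is a guess on my side.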

Thanks for helping me out!

Ca-ressemble-a-du-fake, Oct 04 '22 04:10