
istft

Open wkx-anything opened this issue 1 year ago • 4 comments

File ~/anaconda3/envs/Chat_GLM/lib/python3.10/site-packages/wavmark/models/my_model.py:27, in Model.istft(self, signal_wmd_fft)
     23 def istft(self, signal_wmd_fft):
     25     window = torch.hann_window(self.n_fft).to(signal_wmd_fft.device)
---> 27     return torch.istft(signal_wmd_fft, n_fft=self.n_fft, hop_length=self.hop_length, window=window,
     28                        return_complex=False)

RuntimeError: istft requires a complex-valued input tensor matching the output from stft with return_complex=True.

Torch version 2.0.1, CUDA 11.8.

wkx-anything avatar Jan 17 '24 11:01 wkx-anything

How do I solve this?

wkx-anything avatar Jan 17 '24 11:01 wkx-anything

You have to downgrade torch, but I can't say to which version.

patriotyk avatar Jan 22 '24 21:01 patriotyk

If you don't want to or can't downgrade torch, you can edit venv/lib/python3.10/site-packages/wavmark/models/my_model.py. Change the istft() method to read as follows:

def istft(self, signal_wmd_fft):
    window = torch.hann_window(self.n_fft).to(signal_wmd_fft.device)
    # Newer torch.istft accepts only complex input, so reinterpret the
    # real tensor (last dimension = [real, imag]) as a complex tensor.
    signal_wmd_fft = torch.view_as_complex(signal_wmd_fft)
    return torch.istft(signal_wmd_fft, n_fft=self.n_fft, hop_length=self.hop_length, window=window,
                       return_complex=False)

Note the added line that starts with "signal_wmd_fft = torch.view_as_complex".
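For anyone who wants to try the idea outside of wavmark, here is a minimal standalone sketch of the same conversion. The function name istft_compat, the n_fft and hop_length values, and the random test signal are all made up for illustration and are not taken from wavmark. It assumes the real-valued spectrogram stores the real and imaginary parts in a last dimension of size 2, which is what torch.view_as_complex expects.

```python
import torch

# Hypothetical sizes for illustration only; these are not wavmark's settings.
N_FFT = 512
HOP_LENGTH = 128


def istft_compat(spec, n_fft=N_FFT, hop_length=HOP_LENGTH):
    """Inverse STFT that accepts a complex tensor or a real (..., 2) tensor."""
    window = torch.hann_window(n_fft).to(spec.device)
    if not torch.is_complex(spec):
        # The last dimension holds [real, imag]; view_as_complex needs that
        # dimension to be contiguous, hence the .contiguous() call.
        spec = torch.view_as_complex(spec.contiguous())
    return torch.istft(spec, n_fft=n_fft, hop_length=hop_length, window=window,
                       return_complex=False)


# Round trip on a random signal (length chosen as a multiple of the hop).
signal = torch.randn(1, 4096)
spec_complex = torch.stft(signal, n_fft=N_FFT, hop_length=HOP_LENGTH,
                          window=torch.hann_window(N_FFT), return_complex=True)
spec_real = torch.view_as_real(spec_complex)  # old-style real layout, shape (..., 2)

restored = istft_compat(spec_real)
print(restored.shape)
```

The is_complex check means the helper should also keep working if the caller already passes a complex tensor, so it does not matter which layout the upstream code produces.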

ricperry avatar Jan 31 '24 02:01 ricperry

The dependency in question is wavmark and must be version 0.0.3 to use OpenVoice with Torch 2.3.0.

lazzarello avatar May 14 '24 21:05 lazzarello
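To see which versions are actually installed before choosing between the patch and the pinned wavmark release, a quick check along these lines may help. This is only a sketch using the standard library; the helper name installed_version is made up, and the distribution names "torch" and "wavmark" are assumed to match what pip installed in your environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust them if your environment differs.
for name in ("torch", "wavmark"):
    print(name, installed_version(name) or "not installed")
```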