dvector
cannot reshape tensor of 0 elements into shape [-1, 0]
When the input tensor has shape [1, 800] or [1, 320] and I run the following code:
mel_tensor = wav2mel(wav_tensor, 16000) # 16000 is the sample rate
I get the following error:
Traceback of TorchScript, serialized code (most recent call last):
  File "code/torch/data/wav2mel.py", line 20, in forward
    sample_rate: int) -> Tensor:
    wav_tensor0 = (self.sox_effects).forward(wav_tensor, sample_rate, )
    mel_tensor = (self.log_melspectrogram).forward(wav_tensor0, )
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    return mel_tensor
class SoxEffects(Module):
  File "code/torch/data/wav2mel.py", line 43, in forward
    def forward(self: torch.data.wav2mel.LogMelspectrogram, wav_tensor: Tensor) -> Tensor:
    _3 = (self.melspectrogram).forward(wav_tensor, )
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    mel_tensor = torch.numpy_T(torch.squeeze(_3, 0))
    _4 = torch.clamp(mel_tensor, 1.0000000000000001e-09, None)
  File "code/torch/torchaudio/transforms.py", line 20, in forward
    def forward(self: torch.torchaudio.transforms.MelSpectrogram, waveform: Tensor) -> Tensor:
    specgram = (self.spectrogram).forward(waveform, )
               ~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    mel_specgram = (self.mel_scale).forward(specgram, )
    return mel_specgram
  File "code/torch/torchaudio/transforms.py", line 41, in forward
    waveform: Tensor) -> Tensor:
    _0 = torch.torchaudio.functional.functional.spectrogram
    _1 = _0(waveform, 0, self.window, 400, 160, 400, 2., False, self.center, self.pad_mode, self.onesided, )
         ~~ <--- HERE
    return _1
class MelScale(Module):
  File "code/torch/torchaudio/functional/functional.py", line 18, in spectrogram
    waveform0 = waveform
    shape = torch.size(waveform0)
    waveform2 = torch.reshape(waveform0, [-1, shape[-1]])
                ~~~~~~~~~~~~~ <--- HERE
    spec_f = torch.torch.functional.stft(waveform2, n_fft, hop_length, win_length, window, center, pad_mode, False, onesided, True, )
    _0 = torch.slice(shape, 0, -1, 1)
Traceback of TorchScript, original code (most recent call last):
  File "/home/yist/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torchaudio/transforms.py", line 96, in forward
    Fourier bins, and time is the number of window hops (n_frame).
    """
    return F.spectrogram(
           ~~~~~~~~~~~~~ <--- HERE
    waveform, self.pad,
  File "/home/yist/.pyenv/versions/3.8.5/lib/python3.8/site-packages/torchaudio/functional/functional.py", line 88, in spectrogram
    # pack batch
    shape = waveform.size()
    waveform = waveform.reshape(-1, shape[-1])
               ~~~~~~~~~~~~~~~~ <--- HERE
    # default values are consistent with librosa.core.spectrum._spectrogram
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
How can I solve this problem?
Check the length of your audio files. For me, this only happened with very short clips.
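If it helps, here is a rough sketch of how the length check could be wrapped around the call. The names `safe_wav2mel` and `MIN_SAMPLES` and the 1600-sample threshold are my own assumptions, not part of dvector; the wrapper only assumes that `wav2mel` is callable as in the snippet above and that the input has a `.shape` ending in the number of samples.

```python
from typing import Any, Callable, Optional

# Hypothetical threshold: 0.1 s at 16 kHz. Tune this for your own data.
MIN_SAMPLES = 1600


def safe_wav2mel(
    wav2mel: Callable[[Any, int], Any],
    wav_tensor: Any,
    sample_rate: int,
    min_samples: int = MIN_SAMPLES,
) -> Optional[Any]:
    """Return the mel spectrogram, or None if the clip cannot be processed."""
    # wav_tensor is expected to have shape [channels, samples].
    if wav_tensor.shape[-1] < min_samples:
        return None
    try:
        return wav2mel(wav_tensor, sample_rate)
    except RuntimeError as err:
        # Raised when preprocessing leaves no samples to reshape.
        if "0 elements" in str(err):
            return None
        raise
```

Clips for which this returns None can then be skipped or logged instead of crashing the whole run.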
In my case, replacing wav2mel.pt with the source code of the Wav2Mel class in data/wav2mel.py and deleting the ["silence", ...] entry in self.effects solved the problem.
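For anyone who prefers not to edit the list by hand, a minimal sketch of filtering out the silence entry is below. It assumes `self.effects` is a list of lists of strings with the effect name first; `drop_silence_effects` and the example list are hypothetical, and the `"..."` stands in for the arguments elided above.

```python
from typing import List


def drop_silence_effects(effects: List[List[str]]) -> List[List[str]]:
    """Return a copy of `effects` without any entry named "silence"."""
    return [effect for effect in effects if not effect or effect[0] != "silence"]


# Illustrative list only, not the actual contents of self.effects.
example_effects = [["channels", "1"], ["rate", "16000"], ["silence", "..."]]
kept_effects = drop_silence_effects(example_effects)
```

Keep in mind that without the silence effect, leading and trailing silence stays in the audio, so the resulting mel spectrograms will differ from what wav2mel.pt produces.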