
Audio buffer and Padding size problems

Open teinhonglo opened this issue 4 years ago • 5 comments

Hi, I am trying to train the pase model from scratch and I get the following two errors while training: "Audio buffer is not finite everywhere" and "Padding size should be less than the corresponding input dimension". To work around the first one, I tried adding np.nan_to_num(y) before line 706 of pase/transforms.py, but I don't think that is a good solution. I have no idea how to fix either problem. Any suggestions?

Audio buffer is not finite everywhere

```
Traceback (most recent call last):
  File "train.py", line 465, in <module>
    train(opts)
  File "train.py", line 333, in train
    Trainer.train_(dloader, device=device, valid_dataloader=va_dloader)
  File "/home/teinhonglo/pase/pase/models/WorkerScheduler/trainer.py", line 223, in train_
    batch = next(iterator)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 838, in _next_data
    return self._process_data(data)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
librosa.util.exceptions.ParameterError: Caught ParameterError in DataLoader worker process 8.
Original Traceback (most recent call last):
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/teinhonglo/pase/pase/dataset.py", line 492, in __getitem__
    pkg = self.transform(pkg)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torchvision/transforms/transforms.py", line 70, in __call__
    img = t(img)
  File "/home/teinhonglo/pase/pase/transforms.py", line 706, in __call__
    hop_length=self.hop,
  File "/usr/local/bin/.local/lib/python3.5/site-packages/librosa/feature/spectral.py", line 1442, in mfcc
    S = power_to_db(melspectrogram(y=y, sr=sr, **kwargs))
  File "/usr/local/bin/.local/lib/python3.5/site-packages/librosa/feature/spectral.py", line 1531, in melspectrogram
    power=power)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/librosa/core/spectrum.py", line 1557, in _spectrogram
    S = np.abs(stft(y, n_fft=n_fft, hop_length=hop_length))**power
  File "/usr/local/bin/.local/lib/python3.5/site-packages/librosa/core/spectrum.py", line 161, in stft
    util.valid_audio(y)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/librosa/util/utils.py", line 170, in valid_audio
    raise ParameterError('Audio buffer is not finite everywhere')
librosa.util.exceptions.ParameterError: Audio buffer is not finite everywhere
```
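For reference, the np.nan_to_num workaround mentioned above can be sketched as a small sanitizing step applied to the waveform before it reaches librosa (the `sanitize` helper name is hypothetical; nan_to_num replaces NaN with 0 and ±Inf with large finite values, so it only masks the symptom rather than fixing the corrupt input file):

```python
import numpy as np

def sanitize(y: np.ndarray) -> np.ndarray:
    """Replace NaN with 0 and +/-Inf with large finite values so
    that librosa's valid_audio check passes. This masks bad input
    rather than fixing its source (e.g. a corrupt audio file)."""
    if not np.all(np.isfinite(y)):
        y = np.nan_to_num(y)
    return y

y = np.array([0.1, np.nan, np.inf, -0.2], dtype=np.float32)
print(np.all(np.isfinite(sanitize(y))))  # True
```

A better long-term fix is usually to find and drop (or re-encode) the files that produce non-finite samples in the first place.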

Padding size should be less than the corresponding input dimension

```
Epoch 0/10:   5%|#####3    | 242/5205 [05:53<3:40:07, 2.66s/it]
Traceback (most recent call last):
  File "train.py", line 465, in <module>
    train(opts)
  File "train.py", line 333, in train
    Trainer.train_(dloader, device=device, valid_dataloader=va_dloader)
  File "/home/teinhonglo/pase/pase/models/WorkerScheduler/trainer.py", line 223, in train_
    batch = next(iterator)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 838, in _next_data
    return self._process_data(data)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/teinhonglo/.local/lib/python3.5/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/teinhonglo/pase/pase/dataset.py", line 492, in __getitem__
    pkg = self.transform(pkg)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torchvision/transforms/transforms.py", line 70, in __call__
    img = t(img)
  File "/home/teinhonglo/pase/pase/transforms.py", line 427, in __call__
    pkg['chunk_rand'] = self.select_chunk(raw_rand)
  File "/home/teinhonglo/pase/pase/transforms.py", line 317, in select_chunk
    mode=self.pad_mode).view(-1)
  File "/usr/local/bin/.local/lib/python3.5/site-packages/torch/nn/functional.py", line 2868, in pad
    return torch._C._nn.reflection_pad1d(input, pad)
RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (0, 28656) at dimension 2 of input [1, 1, 3344]
```
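The error itself comes from a PyTorch constraint: `reflection_pad1d` requires the padding amount to be smaller than the input length, and here a 3344-sample chunk is being reflection-padded by 28656 samples, i.e. the audio file is far shorter than the configured chunk size. A minimal sketch of one possible mitigation (the `pad_chunk` helper is hypothetical, not PASE code; it falls back to zero padding when reflection would be illegal):

```python
import torch
import torch.nn.functional as F

def pad_chunk(wav: torch.Tensor, target_len: int) -> torch.Tensor:
    """Pad a [1, 1, T] waveform up to target_len samples.
    Reflection padding requires pad < T, so fall back to
    constant (zero) padding for very short inputs."""
    pad = target_len - wav.shape[-1]
    if pad <= 0:
        return wav
    mode = 'reflect' if pad < wav.shape[-1] else 'constant'
    return F.pad(wav, (0, pad), mode=mode)

short = torch.randn(1, 1, 3344)
out = pad_chunk(short, 32000)   # mode='reflect' would raise here
print(out.shape)                # torch.Size([1, 1, 32000])
```

The cleaner fix, as suggested below in this thread, is to run the segmentation script so that no file is shorter than the training chunk size.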

teinhonglo avatar Mar 11 '20 01:03 teinhonglo

Hey, did you segment your data? I think I got a similar error when I didn't.

MittalShruti avatar Mar 11 '20 06:03 MittalShruti

> Hey, did you segment your data? I think I got a similar error when I didn't.

Did you mean segmenting the data into train/valid/test sets? I wrote the train/valid lists to train.scp and the test list to test.scp, and I created symbolic links to the train/valid/test wavs in data/wavs.

teinhonglo avatar Mar 12 '20 09:03 teinhonglo

No, check the script at /data/prep/prepare_segmented_dataset_libri.py

MittalShruti avatar Mar 13 '20 09:03 MittalShruti

For me, using soundfile.read instead of torchaudio.load solved the padding issue (I used .ogg files, not wavs).

pollytur avatar Jan 12 '21 11:01 pollytur

Hello! I have been replicating this experiment recently, but while preparing the dataset config file I could not figure out where to obtain these files (`--train_scp data/LibriSpeed/libri_tr.scp --test_scp data/LibriSpeed/libri_te.scp --libri_dict data/LibriSpeed/libri_dict.npy`). I look forward to your reply. Thank you.

uuwz avatar Sep 07 '23 12:09 uuwz