
Train Tacotron Error - KeyError

Open aguazul opened this issue 5 years ago • 3 comments

After running preprocess.py I go on to run train_tacotron.py but it keeps giving this KeyError:

```
(pyGPUenv) C:\Users\B\Documents\WaveRNN-master\WaveRNN-master>python train_tacotron.py 0 0
Using device: cuda

Initialising Tacotron Model...

Trainable Parameters: 11.088M
Restoring from latest checkpoint...
Loading latest weights: C:\Users\B\Documents\WaveRNN-master\WaveRNN-master\checkpoints\ljspeech_lsa_smooth_attention.tacotron\latest_weights.pyt
Loading latest optimizer state: C:\Users\B\Documents\WaveRNN-master\WaveRNN-master\checkpoints\ljspeech_lsa_smooth_attention.tacotron\latest_optim.pyt
+----------------+------------+---------------+------------------+
| Steps with r=7 | Batch Size | Learning Rate | Outputs/Step (r) |
+----------------+------------+---------------+------------------+
|   10k Steps    |     24     |     0.001     |        7         |
+----------------+------------+---------------+------------------+

Traceback (most recent call last):
  File "train_tacotron.py", line 211, in <module>
    main()
  File "train_tacotron.py", line 106, in main
    tts_train_loop(paths, model, optimizer, train_set, lr, training_steps, attn_example)
  File "train_tacotron.py", line 136, in tts_train_loop
    for i, (x, m, ids, _) in enumerate(train_set, 1):
  File "C:\Users\B\Anaconda3\envs\pyGPUenv\lib\site-packages\torch\utils\data\dataloader.py", line 346, in __next__
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Users\B\Anaconda3\envs\pyGPUenv\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\B\Anaconda3\envs\pyGPUenv\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\B\Documents\WaveRNN-master\WaveRNN-master\utils\dataset.py", line 151, in __getitem__
    x = text_to_sequence(self.text_dict[item_id], hp.tts_cleaner_names)
KeyError: 'wav_filename_0_050'

(pyGPUenv) C:\Users\B\Documents\WaveRNN-master\WaveRNN-master>
```
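One quick way to narrow this down is to check whether the id the DataLoader is asking for actually exists in the text dictionary that preprocessing produced. The snippet below is only a rough sketch: the filenames `text_dict.pkl` and `dataset.pkl`, their location, and the `(id, length)` tuple layout are assumptions about the preprocess output, so point the path at whatever your `data_path` in hparams.py actually is.

```python
# Hedged debugging sketch: compare the ids the training set will request
# with the keys available in the text dictionary. The pickle filenames and
# the data directory below are assumptions -- adjust them to your setup.
import pickle
from pathlib import Path

data_path = Path('data')  # assumed preprocessing output directory

with open(data_path / 'text_dict.pkl', 'rb') as f:
    text_dict = pickle.load(f)

with open(data_path / 'dataset.pkl', 'rb') as f:
    dataset_ids = [item[0] for item in pickle.load(f)]  # assumed (id, length) entries

print('example text_dict keys:', list(text_dict)[:3])
print('example dataset ids:   ', dataset_ids[:3])

missing = [i for i in dataset_ids if i not in text_dict]
print(f'{len(missing)} of {len(dataset_ids)} ids have no entry in text_dict')
```

If the printed keys do not look like plain wav ids (for example, they contain whole metadata lines), the dictionary was built from a malformed metadata file.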

I have run this script before, but for some reason this time it is giving me issues. Any help would be appreciated.

Thank you! B

aguazul avatar Feb 05 '20 06:02 aguazul

I've done some more digging and found that each of my wav files is listed twice in the dictionary. So when line 151 of dataset.py is run it generates a KeyError because duplicate keys were found, I think.

I don't know why each wav file is listed twice in the dictionary. I've run train_tacotron.py before and it worked just fine...

aguazul avatar Feb 09 '20 05:02 aguazul

After about 2-3 hours of troubleshooting by backtracking through the code, I discovered that the text_dict dictionary had duplicate entries for each wav file. This is what was causing the KeyError because duplicate keys are not allowed in Python dictionaries.

After further debugging, and looking at the ljspeech function in recipes.py, I discovered that it was expecting the pipe (|) character as the delimiter of the CSV file, but I had used a comma (,) as the delimiter.

I resaved the csv file with | as the delimiter and now it works.
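For reference, that failure mode makes sense if the metadata is parsed on the wrong delimiter: splitting a comma-separated line on `|` returns the whole line as a single field, so the dictionary key never matches the wav id that dataset.py later looks up. The sketch below mirrors the style of the LJSpeech recipe (split each metadata line on `|`) but is a simplified illustration, not the exact code from recipes.py.

```python
# Illustration of why the delimiter matters. This mimics the recipe-style
# parsing (one metadata line per utterance, split on '|'); it is a sketch,
# not the repo's actual function.
def build_text_dict(lines, delimiter='|'):
    text_dict = {}
    for line in lines:
        parts = line.strip().split(delimiter)
        text_dict[parts[0]] = parts[-1]   # key = wav id, value = transcript
    return text_dict

pipe_line  = 'wav_filename_0_050|some transcript text'
comma_line = 'wav_filename_0_050,some transcript text'

print(build_text_dict([pipe_line]))
# {'wav_filename_0_050': 'some transcript text'}  -> lookup by wav id works

print(build_text_dict([comma_line]))
# {'wav_filename_0_050,some transcript text': 'wav_filename_0_050,some transcript text'}
# -> the whole line becomes the key, so text_dict['wav_filename_0_050'] raises KeyError
```

Re-saving the metadata file with `|` as the delimiter (or adjusting the split character) makes the keys line up with the wav ids again.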

aguazul avatar Feb 09 '20 06:02 aguazul

I also got the same error! Where can I find the recipes.py file? Thanks in advance!

anantmulchandani avatar Dec 04 '21 20:12 anantmulchandani