Issue with generate_data on speech recognition model

Open nag0811 opened this issue 5 years ago • 0 comments

Description

Hi, I am getting a file-not-found error whenever I run t2t_problem.generate_data(DATA_DIR, TMP_DIR). The data set downloaded successfully to TMP_DIR. ...

Environment information

Python 3.7.7
tensor2tensor 1.15.7

OS: Windows 10


For bugs: reproduction and error logs

# Steps to reproduce:
...

from tensor2tensor.utils import registry
from tensor2tensor import models
#registry.list_models()
from tensor2tensor import problems
import json

from tensor2tensor.utils.trainer_lib import create_hparams

# Print all T2T problems to console
#problems.available()

#PROBLEM = 'translate_enfr_wmt32k_rev'
PROBLEM = 'librispeech_clean'
MODEL = 'TRANSFORMER'
HPARAMS = 'transformer_base'

TRAIN_DIR = '~/translator/model_files'
TMP_DIR = 'C:/MachineLearning/Speech_Recognition/temp/'
DATA_DIR = 'C:/MachineLearning/Speech_Recognition/speech_data/'

t2t_problem = problems.problem(PROBLEM)
t2t_problem.generate_data(DATA_DIR, TMP_DIR)

# Error logs:



t2t_problem.generate_data(DATA_DIR, TMP_DIR)
INFO:tensorflow:Not downloading, file already found: C:/MachineLearning/Speech_Recognition/temp/test-clean.tar.gz
INFO:tensorflow:Not downloading, file already found: C:/MachineLearning/Speech_Recognition/temp/test-clean.tar.gz
Traceback (most recent call last):

  File "<ipython-input-11-c1b5d9c269ae>", line 1, in <module>
    t2t_problem.generate_data(DATA_DIR, TMP_DIR)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\librispeech.py", line 169, in generate_data
    self.generator(data_dir, tmp_dir, self.TEST_DATASETS), test_paths)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\generator_utils.py", line 174, in generate_files
    for case in generator:

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\librispeech.py", line 149, in generator
    wav_data = audio_encoder.encode(media_file)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\audio_encoder.py", line 62, in encode
    convert_to_wav(s, out_filepath)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensor2tensor\data_generators\audio_encoder.py", line 51, in convert_to_wav
    call(args + [in_path, out_path])

  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 339, in call
    with Popen(*popenargs, **kwargs) as p:

  File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 105, in __init__
    super(SubprocessPopen, self).__init__(*args, **kwargs)

  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 800, in __init__
    restore_signals, start_new_session)

  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 1207, in _execute_child
    startupinfo)

FileNotFoundError: [WinError 2] The system cannot find the file specified
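
Note: the traceback ends inside subprocess.call, reached from convert_to_wav, so the "file" that Windows cannot find may be the external program being launched rather than the audio file itself. The traceback does not show the contents of args, so the program name is not confirmed here. The sketch below assumes it is "sox" (an assumption) and checks whether that executable is on PATH. The helper name find_converter is hypothetical and is not part of tensor2tensor.

```python
import shutil


def find_converter(name="sox"):
    """Return the full path of `name` if it is on PATH, else None.

    "sox" is an assumed default; the traceback does not show which
    program convert_to_wav actually launches.
    """
    return shutil.which(name)


path = find_converter()
if path is None:
    # With no matching executable on PATH, subprocess.call raises
    # FileNotFoundError ([WinError 2] on Windows).
    print("converter not found on PATH")
else:
    print("converter found at", path)
```

If the check reports that the converter is missing, installing it and adding its directory to PATH before calling generate_data would be the first thing to try.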

...

Jul 22 '20 17:07 nag0811