asr-study
data parser doesn't work
The example from README.md,
python -m extras.make_dataset --parser brsp --input_parser mfcc --label_parser simple_char_parser
returns the following error:
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/data/forked/asr-study/extras/make_dataset.py", line 32, in <module>
    regex=True)
  File "utils/generic_utils.py", line 62, in get_from_module
    (name, module, ', '.join(members.keys())))
KeyError: 'brsp not found in datasets*.\n Valid values are: dummy, sid, brsd, voxforge, lapsbm, cslu, datasetparser'
If I change brsp to brsd (which is the parser available in the datasets folder), then I get:
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/lapsbm/LapsBM-F019/LapsBM_0378.wav has a forbidden label: "acertou o alvo em quarenta e três por cento das suas chances". Skipping
Traceback (most recent call last):
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/data/forked/asr-study/extras/make_dataset.py", line 46, in <module>
    override=args.override)
  File "datasets/dataset_parser.py", line 128, in to_h5
    group = f.create_group(dataset)
  File "/home/zparcheta/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 52, in create_group
    gid = h5g.create(self.id, name, lcpl=lcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2846)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2804)
  File "h5py/h5g.pyx", line 151, in h5py.h5g.create (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/h5g.c:2929)
ValueError: Unable to create group (Name already exists)
The warning appears for every line of text, and each of those files is skipped. How can I prepare the data for training? I have already downloaded the data into the data folder.
Sorry about this problem. I changed the behavior of label_parser and did not update README.md. Instead of
python -m extras.make_dataset --parser brsp --input_parser mfcc --label_parser simple_char_parser
please use
python -m extras.make_dataset --parser brsd --input_parser mfcc
to create the dataset. I think the rest of README.md is fine after that.
Let me know if anything else comes up!
I think there are still some problems. Some of them occur because the path to the wav files is not correct, e.g. asr-study/data/voxforge/brunox-20110225-wqa/216.wav should be asr-study/data/voxforge/brunox-20110225-wqa/wav/216.wav. Other files simply don't exist. There are also some warnings about forbidden labels, and I don't know exactly what that means.
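To make the path mismatch easier to check, here is a minimal sketch that looks for a wav file first at the path the parser reports and then inside a `wav/` subfolder next to it. The helper names `resolve_wav` and `demo` are hypothetical and are not part of asr-study; the sketch only uses the standard library and builds a throwaway directory tree that mimics the layout described above.

```python
import os
import tempfile


def resolve_wav(expected_path, subdir="wav"):
    """Return an existing path for expected_path, or None.

    Tries the path as given, then the same file name inside a `subdir`
    folder next to it (speaker/216.wav -> speaker/wav/216.wav).
    """
    if os.path.isfile(expected_path):
        return expected_path
    parent, name = os.path.split(expected_path)
    candidate = os.path.join(parent, subdir, name)
    if os.path.isfile(candidate):
        return candidate
    return None


def demo():
    # Throwaway tree: <tmp>/voxforge/brunox-20110225-wqa/wav/216.wav
    root = tempfile.mkdtemp()
    speaker = os.path.join(root, "voxforge", "brunox-20110225-wqa")
    os.makedirs(os.path.join(speaker, "wav"))
    open(os.path.join(speaker, "wav", "216.wav"), "wb").close()

    expected = os.path.join(speaker, "216.wav")  # path the parser reports
    absent = os.path.join(speaker, "999.wav")    # file that does not exist
    return expected, resolve_wav(expected), resolve_wav(absent)


if __name__ == "__main__":
    expected, found, missing = demo()
    print("reported:", expected)
    print("found   :", found)
    print("missing :", missing)
```

Running something like this over the paths listed in the ERROR lines would separate files that are merely in a different subfolder from files that are really absent from the download.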
If you know how I can run the data parser command properly, please tell me :) regards!
python -m extras.make_dataset --parser brsd --input_parser mfcc
datasets.dataset_parser.VoxForge: ERROR File /data/forked/asr-study/data/voxforge/brunox-20110225-wqa/216.wav not found
datasets.dataset_parser.BRSD: WARNING Skipping dataset cslu: Dataset directory provided is not a directory
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014053.wav has a forbidden label: "". Skipping
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014054.wav has a forbidden label: "". Skipping
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014055.wav has a forbidden label: "". Skipping
datasets.dataset_parser.Sid: ERROR File /data/forked/asr-study/data/sid/M0001/M0001000.wav not found
 could not be converted in int.ROR age
Traceback (most recent call last):
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/data/forked/asr-study/extras/make_dataset.py", line 46, in <module>
    override=args.override)
  File "datasets/dataset_parser.py", line 128, in to_h5
    group = f.create_group(dataset)
  File "/home/zparcheta/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 52, in create_group
    gid = h5g.create(self.id, name, lcpl=lcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2846)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2804)
  File "h5py/h5g.pyx", line 151, in h5py.h5g.create (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/h5g.c:2929)
ValueError: Unable to create group (Name already exists)
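For anyone hitting the same final ValueError: it comes from h5py itself, which refuses to create a group whose name is already present in the file. The sketch below reproduces that behaviour in isolation and shows two generic ways around it, opening the file in "w" mode (which truncates it) and using `require_group` (which returns an existing group instead of failing). It is not the code of `to_h5`; the function names and the file name `demo.h5` are made up for illustration, and whether the group already exists because of an earlier run or because the same dataset name is visited twice cannot be told from the traceback alone. The traceback does show an `override` argument being passed to `to_h5`, but how it is exposed on the command line is not shown in this thread.

```python
import os
import tempfile

import h5py  # third-party: the library that raises the error above


def duplicate_group_error(path):
    """Create the same group twice and return h5py's error message."""
    with h5py.File(path, "w") as f:  # "w" starts from an empty file
        f.create_group("lapsbm")
        try:
            f.create_group("lapsbm")  # same name again -> ValueError
        except ValueError as exc:
            return str(exc)
    return None


def write_groups(path, names):
    """Create one group per name, tolerating repeated names."""
    with h5py.File(path, "w") as f:
        for name in names:
            # require_group opens the group if it exists, else creates it
            f.require_group(name)
        return sorted(f.keys())


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.h5")
    print(duplicate_group_error(path))
    print(write_groups(path, ["lapsbm", "sid", "lapsbm"]))
```

If the output file from a previous, interrupted run is still on disk, deleting it before re-running the command is the simplest thing to try first.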