data parser doesn't work

Open zparcheta opened this issue 7 years ago • 2 comments

The example from README.md,

python -m extras.make_dataset --parser brsp \
    --input_parser mfcc --label_parser simple_char_parser

returns the following error:

  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/data/forked/asr-study/extras/make_dataset.py", line 32, in <module>
    regex=True)
  File "utils/generic_utils.py", line 62, in get_from_module
    (name, module, ', '.join(members.keys())))
KeyError: 'brsp not found in datasets*.\n Valid values are: dummy, sid, brsd, voxforge, lapsbm, cslu, datasetparser'

If I replace brsp with brsd (the parser that is actually available in the datasets folder), I get:

datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/lapsbm/LapsBM-F019/LapsBM_0378.wav has a forbidden label: "acertou o alvo em quarenta e três por cento das suas chances". Skipping
Traceback (most recent call last):
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/data/forked/asr-study/extras/make_dataset.py", line 46, in <module>
    override=args.override)
  File "datasets/dataset_parser.py", line 128, in to_h5
    group = f.create_group(dataset)
  File "/home/zparcheta/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 52, in create_group
    gid = h5g.create(self.id, name, lcpl=lcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2846)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2804)
  File "h5py/h5g.pyx", line 151, in h5py.h5g.create (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/h5g.c:2929)
ValueError: Unable to create group (Name already exists)

The warning appears for every line of text, and each of those files is skipped. How can I prepare the data for training? I have already downloaded the data into the data folder.

zparcheta, Oct 06 '17 14:10
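
A note on the final ValueError in the traceback above: h5py raises "Unable to create group (Name already exists)" when create_group is called with a name that is already present in the open HDF5 file. One plausible cause, which is an assumption and not something confirmed in this thread, is that an output file left over from an earlier run already contains that group. A minimal Python sketch of a workaround follows; recreate_group and write_group are hypothetical helpers, not part of asr-study, and only the h5py calls themselves (File, create_group, the `in` test, `del`) are real library API.

```python
def recreate_group(h5file, name):
    """Return a fresh, empty group called `name` in an open, writable h5py.File.

    Hypothetical helper, not asr-study code. Any existing group with the
    same name is deleted first, so create_group cannot fail with
    "Name already exists".
    """
    if name in h5file:
        del h5file[name]  # unlinks the old group from the file
    return h5file.create_group(name)


def write_group(path, name):
    """Open `path` in append mode and (re)create the group `name`."""
    import h5py  # third-party; imported here so recreate_group has no hard dependency

    with h5py.File(path, "a") as f:
        group = recreate_group(f, name)
        return group.name  # absolute HDF5 path of the new group
```

Deleting the group discards whatever the earlier run wrote. If the existing contents should be kept instead, h5py's require_group(name) opens the group when it exists and creates it otherwise.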

Sorry about this problem. I changed the behavior of label_parser and did not update README.md. Instead of

python -m extras.make_dataset --parser brsp \
    --input_parser mfcc --label_parser simple_char_parser

please use

python -m extras.make_dataset --parser brsd --input_parser mfcc

to create the dataset. Then I think that the rest of the README.md is fine.

Let me know if anything else comes up!

igormq, Oct 12 '17 01:10

I think there are still some problems. Some occur because the path to the wav files is incorrect, e.g. asr-study/data/voxforge/brunox-20110225-wqa/216.wav should be asr-study/data/voxforge/brunox-20110225-wqa/wav/216.wav. Other files simply don't exist. There are also warnings about forbidden labels, and I don't know exactly what they mean.

If you know how I can run the data parser command properly, please tell me :) regards!

python -m extras.make_dataset --parser brsd --input_parser mfcc

datasets.dataset_parser.VoxForge: ERROR File /data/forked/asr-study/data/voxforge/brunox-20110225-wqa/216.wav not found
datasets.dataset_parser.BRSD: WARNING Skipping dataset cslu: Dataset directory provided is not a directory
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014053.wav has a forbidden label: "". Skipping
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014054.wav has a forbidden label: "". Skipping
datasets.dataset_parser.BRSD: WARNING File /data/forked/asr-study/data/sid/F0014/F0014055.wav has a forbidden label: "". Skipping
datasets.dataset_parser.Sid: ERROR File /data/forked/asr-study/data/sid/M0001/M0001000.wav not found
could not be converted in int.ROR age
Traceback (most recent call last):
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/zparcheta/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/data/forked/asr-study/extras/make_dataset.py", line 46, in <module>
    override=args.override)
  File "datasets/dataset_parser.py", line 128, in to_h5
    group = f.create_group(dataset)
  File "/home/zparcheta/anaconda2/lib/python2.7/site-packages/h5py/_hl/group.py", line 52, in create_group
    gid = h5g.create(self.id, name, lcpl=lcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2846)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/_objects.c:2804)
  File "h5py/h5g.pyx", line 151, in h5py.h5g.create (/home/ilan/minonda/conda-bld/h5py_1490028130695/work/h5py/h5g.c:2929)
ValueError: Unable to create group (Name already exists)

zparcheta, Oct 31 '17 11:10
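
For the "not found" errors reported in the comment above, where a file such as 216.wav actually sits in a wav/ subfolder of the speaker directory, a small check can tell which expected paths are merely misplaced and which are truly missing. The sketch below is only an illustration in Python: resolve_wav is a hypothetical helper, not part of asr-study, and the wav/ fallback is an assumption taken from the single example path in this thread.

```python
from pathlib import Path


def resolve_wav(path):
    """Return an existing location for `path`, or None if the file is missing.

    Hypothetical helper. Tries the path as given first, then the same file
    name inside a "wav" subfolder of the same directory.
    """
    p = Path(path)
    if p.is_file():
        return p
    candidate = p.parent / "wav" / p.name  # e.g. .../speaker/wav/216.wav
    if candidate.is_file():
        return candidate
    return None
```

Running it over the paths named in the ERROR lines separates the two cases: a returned path means the file exists under wav/, while None means it is absent from the download.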