E2E-ASR
The label decoded is not the same as the pmap data.
@HawkAaron I have run into a problem: the decoded result includes indices such as [52, 53, 54, 55, 56, 57, ...], while the length of rephone is just 51. As a result, I hit the bug shown below. One of the decoded results y is:
(53, 59, 53, 59, 56, 53, 43, 53, 43, 53, 5, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 25, 53, 25, 53, 48, 53, 59, 53, 25, 53, 43, 53, 43, 53, 43, 53, 59, 53, 59, 31, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 43, 53, 25, 53, 25, 53, 43, 53, 35, 53, 43, 53, 31, 53, 43, 53, 43, 53, 31, 53, 43, 53, 48, 53, 59, 5, 53, 32, 53, 43, 53, 59, 43, 31, 53, 59, 53, 25, 53, 25, 5, 25, 32, 53, 43, 53, 43, 53, 43, 53, 43, 53, 59, 53, 31, 53, 59, 53, 59)
Traceback (most recent call last):
  File "eval.py", line 93, in <module>
    decode()
  File "eval.py", line 84, in decode
    y = [pmap[rephone[i]] for i in y]
  File "eval.py", line 84, in <listcomp>
    y = [pmap[rephone[i]] for i in y]
KeyError: 53
Why is the length of my pmap (or y) not the same as that of rephone? Also, some phonemes, such as 'sil', are not included in the pmap dict.
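A possible way to narrow this down: the traceback reports KeyError: 53 with an integer key, which suggests the failing lookup is rephone[i] rather than pmap[...] (assuming rephone is a dict from integer ids to phone strings, which the thread does not show). The sketch below lists which decoded ids have no entry in such a table. The helper name find_unmapped_ids and the stand-in data are hypothetical, not part of eval.py.

```python
def find_unmapped_ids(decoded, id_to_phone):
    """Return the sorted, de-duplicated decoded ids missing from id_to_phone."""
    return sorted({i for i in decoded if i not in id_to_phone})


# Hypothetical stand-ins: a 51-entry id table (ids 0..50) and a short
# excerpt of a decoded sequence like the one posted above.
rephone = {i: "ph%d" % i for i in range(51)}
y = (53, 59, 53, 43, 5, 25)

missing = find_unmapped_ids(y, rephone)
print(missing)  # [53, 59]
```

If the list is non-empty, the model is emitting ids that the phone table does not know about, so the model's output vocabulary and the phone table were probably built from different phone sets.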
I encountered a similar error when decoding with the TIMIT dataset. I fixed it by changing the phone mapping part in eval.py. Here is my code:
with open('conf/phones.60-48-39.map', 'r') as f:
    pmap = {rephone[1]: rephone[1]}
    for line in f:
        line = line.split()
        if len(line) < 2: pmap[line[0]] = rephone[0]
        else: pmap[line[1]] = line[1]
print(pmap)
@zhoudoufu Why did you change pmap[line[0]] = line[2] to pmap[line[1]] = line[1]? And why use pmap = {rephone[1]: rephone[1]} rather than pmap = {rephone[0]: rephone[0]}?
So, dear friends, why does this happen? I mean, what bug causes this kind of problem? I have run into the same one too.
@gccyxy Hello, sir, I have run into the same problem as you. Do you have any advice? Many thanks and best regards.
I replaced else: pmap[line[0]] = line[2] with else: pmap[line[1]] = line[2], and the final PER for RNN-T is 20.62% after 183 epochs of training.
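For anyone trying the same change, here is a minimal sketch of building the map from the second column to the third, which is what pmap[line[1]] = line[2] does. It assumes the map file has up to three whitespace-separated columns per line (60-phone, 48-phone, 39-phone labels) and that some lines have fewer columns. The function name load_phone_map and the sample lines are illustrative, not taken from eval.py or the real file.

```python
import io


def load_phone_map(lines, src_col=1, dst_col=2):
    """Map column src_col to column dst_col, skipping lines that are too short."""
    pmap = {}
    for raw in lines:
        cols = raw.split()
        # Lines without both columns carry no mapping, so skip them.
        if len(cols) <= max(src_col, dst_col):
            continue
        pmap[cols[src_col]] = cols[dst_col]
    return pmap


# Illustrative sample in the assumed three-column layout; the last line
# has a single column and is skipped.
sample = io.StringIO("aa aa aa\nao ao aa\ncl cl sil\nq\n")
pmap = load_phone_map(sample)
print(pmap)  # {'aa': 'aa', 'ao': 'aa', 'cl': 'sil'}
```

With the real file, the same function could be called on the open file handle, e.g. load_phone_map(open('conf/phones.60-48-39.map')). Which column to use as the key depends on which phone set the model was trained on, which is the point of disagreement in this thread.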