End-to-end-ASR-Pytorch
Single file Inference
Hi team, I want to run inference on a single audio file. Can anyone please help me with this?
Hello, @shamil-kadavan. I think that when bucketing is used, the code applies squeeze(inputs, dim=0). For example, an input of shape (1, 440, 40) becomes (440, 40) after the squeeze.
So the RNN can't run a forward pass on that input shape.
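The shape change described above can be reproduced in isolation. This is a minimal sketch, assuming a dummy feature tensor of 440 frames by 40 features. The variable names are illustrative and not taken from the repository, and whether restoring the batch dimension is enough depends on the model's own forward code.

```python
import torch

# Hypothetical single-utterance features: (batch=1, frames=440, feature_dim=40).
inputs = torch.zeros(1, 440, 40)

# The squeeze described above drops the leading dimension when its size is 1.
squeezed = torch.squeeze(inputs, dim=0)
print(squeezed.shape)  # torch.Size([440, 40])

# One possible workaround: put the batch dimension back before the forward pass.
restored = squeezed.unsqueeze(0)
print(restored.shape)  # torch.Size([1, 440, 40])
```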
Below is an example of converting the gold and predicted indices into characters for printing (shown here for the gold labels).
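# batch_label, idx2char and gt are defined earlier in the surrounding script.
try:
    # Batched case: one row of label indices per utterance.
    for line in (torch.max(batch_label,dim=-1)[1]).numpy():
        tmp = ''
        for idx in line:
            if idx == idx2char.index('<sos>'): continue
            if idx == idx2char.index('<eos>'): break
            tmp += idx2char[idx]
        gt.append(tmp)
except TypeError:
    # Single-utterance case: the argmax result is 1-D, so each element is a
    # scalar and iterating over it in the inner loop above raises TypeError.
    tmp = ''
    for idx in (torch.max(batch_label,dim=-1)[1]).numpy():
        if idx == idx2char.index('<sos>'): continue
        if idx == idx2char.index('<eos>'): break
        tmp += idx2char[idx]
    gt.append(tmp)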
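The same index-to-character conversion can be written without the try/except by checking the input shape up front. The following is only a sketch in plain Python: the helper names `indices_to_text` and `decode_batch` and the tiny vocabulary are hypothetical and not part of the repository.

```python
def indices_to_text(indices, idx2char, sos="<sos>", eos="<eos>"):
    """Convert one sequence of label indices to a string.

    Skips the start token and stops at the first end token.
    """
    sos_idx = idx2char.index(sos)
    eos_idx = idx2char.index(eos)
    chars = []
    for idx in indices:
        if idx == sos_idx:
            continue
        if idx == eos_idx:
            break
        chars.append(idx2char[idx])
    return "".join(chars)


def decode_batch(batch_indices, idx2char):
    """Accept either a single sequence of ints or a list of sequences."""
    if len(batch_indices) > 0 and isinstance(batch_indices[0], int):
        batch_indices = [batch_indices]  # wrap a single utterance as a batch of one
    return [indices_to_text(seq, idx2char) for seq in batch_indices]


# Hypothetical vocabulary, for illustration only.
idx2char = ["<sos>", "<eos>", "a", "b", "c"]
print(decode_batch([0, 2, 3, 1, 4], idx2char))            # ['ab']
print(decode_batch([[0, 4, 2, 1], [0, 3, 1]], idx2char))  # ['ca', 'b']
```

With tensors, calling `.tolist()` on the argmax result first gives plain Python ints that these helpers can consume.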
@qute012 Thanks for the answer. I am still confused about how to do single-file inference. My knowledge of speech recognition is very minimal.