End-to-end-ASR-Pytorch

Single file Inference

Open shamil-kadavan opened this issue 5 years ago • 2 comments

Hi team, I want to run inference on a single audio file. Can anyone please help me with this?

shamil-kadavan, Jan 31 '20 14:01

Hello, @shamil-kadavan. I think that when you use bucketing, the inputs go through squeeze(inputs, dim=0). For example, an input of shape (1, 440, 40) comes out of the squeeze with shape (440, 40).

So the RNN can't forward an input of that shape.
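The shape change described above can be reproduced with a short sketch. The tensor contents, the GRU layer, and its sizes are illustrative assumptions, not code from this repository. Depending on the PyTorch version, a 2-D input is either rejected by the RNN or treated as a single unbatched sequence, so restoring the batch dimension explicitly is the safer route.

```python
import torch
import torch.nn as nn

# Hypothetical feature tensor: 1 utterance, 440 frames, 40 features per frame.
inputs = torch.randn(1, 440, 40)

# squeeze(dim=0) drops the leading dimension because its size is 1.
squeezed = torch.squeeze(inputs, dim=0)   # shape (440, 40)

# Put the batch dimension back before calling a batch_first RNN.
restored = squeezed.unsqueeze(0)          # shape (1, 440, 40)

# Illustrative layer; the real model in the repository may differ.
rnn = nn.GRU(input_size=40, hidden_size=16, batch_first=True)
outputs, hidden = rnn(restored)

print(tuple(squeezed.shape), tuple(restored.shape), tuple(outputs.shape))
```

For a single audio file, this suggests keeping a batch dimension of size 1 all the way to the model instead of letting it be squeezed away.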

Below is an example for printing the gold and predicted sequences converted to characters.

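import torch

# gt (list of strings) and idx2char (index-to-character list) are assumed to be defined earlier.
# torch.max(..., dim=-1)[1] is the argmax over the last dimension, i.e. the label indices.
label_ids = torch.max(batch_label, dim=-1)[1].numpy()
try:
    # Batched case: label_ids is 2-D, one row per utterance.
    for line in label_ids:
        tmp = ''
        for idx in line:
            if idx == idx2char.index('<sos>'): continue
            if idx == idx2char.index('<eos>'): break
            tmp += idx2char[idx]
        gt.append(tmp)
except TypeError:
    # Single-sequence case: label_ids is 1-D, so iterating over a "row" fails.
    tmp = ''
    for idx in label_ids:
        if idx == idx2char.index('<sos>'): continue
        if idx == idx2char.index('<eos>'): break
        tmp += idx2char[idx]
    gt.append(tmp)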
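The same index-to-character conversion can be sketched as a small standalone helper that handles one sequence at a time. The function name `decode` and the toy vocabulary are hypothetical and only serve to illustrate the `<sos>`/`<eos>` handling from the snippet above.

```python
def decode(indices, idx2char, sos='<sos>', eos='<eos>'):
    """Convert a sequence of label indices to a string.

    Skips the start token and stops at the first end token.
    """
    sos_id = idx2char.index(sos)
    eos_id = idx2char.index(eos)
    chars = []
    for idx in indices:
        if idx == sos_id:
            continue
        if idx == eos_id:
            break
        chars.append(idx2char[idx])
    return ''.join(chars)


# Hypothetical vocabulary, for illustration only.
idx2char = ['<sos>', '<eos>', 'a', 'b', 'c']

# Everything after the first <eos> is ignored.
print(decode([0, 2, 3, 4, 1, 2], idx2char))  # prints "abc"
```

Calling the helper once per row covers the batched case, and calling it directly on a 1-D sequence covers a single file, so no try/except is needed.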

dobby-seo, Feb 28 '20 12:02

@qute012 Thanks for the answer, but I am still confused about how to do single-file inference. My knowledge of speech recognition is very limited.

shamil-kadavan, Mar 03 '20 10:03