D S Pavan Kumar

21 comments by D S Pavan Kumar

Hello Kevin, Thank you for the suggestion. Can you comment on how big the WER difference is, and on how many hours of data the training setup used? When I...

It runs with TensorFlow as well. I chose Theano for better parallel decoding. Each TensorFlow process occupies all the GPU memory by default, so if we have to share a...

It may be a memory issue, especially if you are running all the jobs on the same machine. Parallelising it on a Sun Grid Engine cluster can help. Otherwise, an easy...

> steps/align_si.sh --nj 4 --cmd /kaldi-trunk/egs/... There is no argument to --cmd. You probably didn't run cmd.sh correctly. > FileNotFoundError: [Errno 2] No such file or directory: 'am-info': 'am-info' The...

Yes, a 44.1 kHz sampling frequency (or anything above roughly 20 kHz) requires a larger FFT size, because a 25 ms window then contains more than 512 samples. I updated the scripts to change...
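The arithmetic behind that comment can be sketched as follows. This is only an illustration, not the change made in the scripts (which is truncated above); the function names `frame_length_samples` and `min_fft_size` are hypothetical, and the sketch assumes the FFT size is chosen as the smallest power of two that fits one window.

```python
def frame_length_samples(sample_rate_hz, window_ms=25.0):
    """Number of whole samples in one analysis window."""
    return int(sample_rate_hz * window_ms / 1000.0)


def min_fft_size(sample_rate_hz, window_ms=25.0):
    """Smallest power of two that can hold one full window (assumed policy)."""
    n = frame_length_samples(sample_rate_hz, window_ms)
    size = 1
    while size < n:
        size *= 2
    return size


for rate in (16000, 44100):
    print(rate, frame_length_samples(rate), min_fft_size(rate))
```

At 16 kHz a 25 ms window is 400 samples, so a 512-point FFT is enough; at 44.1 kHz the window is about 1102 samples, so under this assumption the FFT size becomes 2048.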

Hi Aashish, MFCC implementations can differ considerably. Mine closely follows that of HTK; I am not aware of how exactly librosa implements it. Some general implementation differences are:...

No, it currently doesn't support the first and second order differences. However, it's simple to compute them.
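One possible way to compute them is sketched below. This is not part of the code being discussed; it assumes the features are a NumPy array of shape (frames, coefficients) and uses the common regression-style delta formula with edge frames repeated as padding. The function name `deltas` and the default `window=2` are assumptions made for the example.

```python
import numpy as np


def deltas(features, window=2):
    """Regression-based differences along the time axis.

    d[t] = sum_{n=1..window} n * (c[t+n] - c[t-n]) / (2 * sum_{n} n^2)
    The first and last frames are repeated to pad the edges.
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = np.zeros_like(features)
    for n in range(1, window + 1):
        ahead = padded[window + n : window + n + num_frames]
        behind = padded[window - n : window - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


# Placeholder features: 100 frames of 13 coefficients.
mfcc = np.random.randn(100, 13)
d1 = deltas(mfcc)        # first-order differences
d2 = deltas(d1)          # second-order differences
full = np.hstack([mfcc, d1, d2])  # shape (100, 39)
```

Applying the same function twice gives the second-order differences, and stacking the three arrays yields the usual static + delta + delta-delta feature vector.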

Mel-flooring ensures that the computed log filterbank energies are non-negative. If 0...
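The effect of flooring can be illustrated with a minimal sketch. The floor value of 1.0 and the names `MEL_FLOOR` and `log_energy` are assumptions for this example; the comment above does not state which value the code uses. Any floor of at least 1.0 makes the logarithm non-negative, since log(1) = 0.

```python
import math

MEL_FLOOR = 1.0  # assumed floor value; log(1.0) == 0.0


def log_energy(energy, floor=MEL_FLOOR):
    """Log of a filterbank energy, clamped from below at `floor`."""
    return math.log(max(energy, floor))


for e in (0.0, 0.5, 1.0, 100.0):
    print(e, log_energy(e))
```

Without the floor, an energy of 0 would make the logarithm undefined and energies below 1 would give negative values.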

Thanks for pointing this out; it did not distinguish multi-channel data and behaved incorrectly in such cases. For now I added a short check that throws an exception for multi-channel data....
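A check of this kind might look like the sketch below. It is not the code that was actually added; it assumes the input is a WAV file read with Python's standard `wave` module, and the function name `require_mono` is hypothetical.

```python
import wave


def require_mono(path):
    """Raise ValueError if the WAV file at `path` is not single-channel."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
    if channels != 1:
        raise ValueError(
            f"expected single-channel audio, got {channels} channels: {path}"
        )
    return channels
```

Failing early like this avoids silently treating interleaved channels as one signal.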