D S Pavan Kumar comments

Results: 21 comments of D S Pavan Kumar

changed the feature stream pipelines

Hello Kevin, Thank you for the suggestion. Can you comment on how big the WER difference is, and on how many hours of data the training setup used? When I...

Can this code work on a tensorflow trained model

It runs with TensorFlow as well. I set Theano for better parallel decoding. Each TensorFlow process occupies all the GPU memory by default, so if we have to share a...
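One possible way to keep several decoding processes from each claiming the whole GPU is sketched below. It is not taken from the repository: the helper name `worker_env` is hypothetical, and it assumes a TensorFlow version that honours the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable (older releases may need a session-level option instead).

```python
import os


def worker_env(gpu_id, base_env=None):
    """Build an environment for one decoding subprocess (hypothetical helper).

    Assumption: the installed TensorFlow reads TF_FORCE_GPU_ALLOW_GROWTH, so
    each process allocates GPU memory on demand instead of reserving it all.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)   # pin the process to one GPU
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"   # grow memory as needed
    return env


# Environment for a job that shares GPU 0 with other jobs.
env = worker_env(0)
```

The returned dictionary can be passed as `env=` to `subprocess.Popen` when launching each parallel job.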

Can this code work on a tensorflow trained model

It may be a memory issue, especially if you are running all the jobs on the same machine. Parallelising it on a Sun Grid Engine cluster can help. Otherwise, an easy...

Can this code work on a tensorflow trained model

Okay.

Problem while running run_kt.sh

> steps/align_si.sh --nj 4 --cmd /kaldi-trunk/egs/... There is no argument to --cmd. You probably didn't run cmd.sh correctly. > FileNotFoundError: [Errno 2] No such file or directory: 'am-info': 'am-info' The...

Can I modify the sample rate?

Yes. A 44.1 kHz sampling frequency (or anything above roughly 20 kHz) requires a larger FFT size, because a 25 ms window then contains more than 512 samples. I updated the scripts to change...
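The arithmetic behind this can be checked with a small sketch. The function name `frame_and_fft_size`, the rounding up of fractional sample counts, and the power-of-two FFT size are assumptions made for illustration; the actual scripts may choose these differently.

```python
import math


def frame_and_fft_size(sample_rate_hz, window_ms=25.0):
    """Return (samples per window, smallest power-of-two FFT size that fits)."""
    # Assumption: fractional sample counts are rounded up.
    window_samples = math.ceil(sample_rate_hz * window_ms / 1000.0)
    fft_size = 1
    while fft_size < window_samples:
        fft_size *= 2
    return window_samples, fft_size


# At 16 kHz a 25 ms window is 400 samples, which fits a 512-point FFT.
print(frame_and_fft_size(16000))
# At 44.1 kHz the same window exceeds 512 samples, so a larger FFT is needed.
print(frame_and_fft_size(44100))
```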

comparison with librosa

Hi Aashish, MFCC implementations can differ considerably. Mine closely follows that of HTK; I am not aware of how exactly librosa implements it. Some general implementation differences are:...

Does it support first- and second-order coefficients for MFCC?

No, it currently doesn't support the first and second order differences. However, it's simple to compute them.
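A minimal sketch of one common way to compute them, using a regression over neighbouring frames, is given below. It is not part of the repository: the function name `deltas`, the window of 2 frames, and the edge padding are assumptions, and the random matrix only stands in for real MFCC output.

```python
import numpy as np


def deltas(features, window=2):
    """Regression-based differences for a (frames x coefficients) array."""
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    # Repeat the first and last frames so edge frames have full context.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = np.zeros_like(features)
    for k in range(1, window + 1):
        ahead = padded[window + k : window + k + num_frames]
        behind = padded[window - k : window - k + num_frames]
        out += k * (ahead - behind)
    return out / denom


# Placeholder features: 100 frames of 13 coefficients.
mfcc = np.random.default_rng(0).standard_normal((100, 13))
d1 = deltas(mfcc)                 # first-order differences
d2 = deltas(d1)                   # second-order differences
full = np.hstack([mfcc, d1, d2])  # 39-dimensional feature vectors
```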

What's 'Apply Mel-flooring' for?

Mel-flooring ensures that the computed log filterbank energies are non-negative. If 0
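The effect can be illustrated with a short sketch. The floor value of 1.0 and the function name `log_mel_energies` are assumptions chosen for the example, not confirmed details of the code.

```python
import numpy as np


def log_mel_energies(mel_energies, floor=1.0):
    """Floor the filterbank energies, then take the natural log."""
    # Assumption: floor of 1.0, so log(floor) == 0 and no output is negative.
    floored = np.maximum(np.asarray(mel_energies, dtype=float), floor)
    return np.log(floored)


energies = np.array([0.0, 0.5, 1.0, 20.0])
log_e = log_mel_energies(energies)
# Entries at or below the floor map to 0; log(0) is never evaluated.
print(log_e)
```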

Does this code distinguish between mono and stereo audio?

Thanks for pointing this out. It did not distinguish multi-channel data and behaved incorrectly in such cases. For now I added a short check that throws an exception for multi-channel data....
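A check of this kind might look like the sketch below, which uses Python's standard `wave` module. The function name `read_mono_frames` and the choice of `ValueError` are assumptions for illustration; the repository's actual check may read audio differently.

```python
import wave


def read_mono_frames(path):
    """Return (sample rate, raw frame bytes); reject multi-channel files."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        if channels != 1:
            raise ValueError(
                f"expected mono audio, got {channels} channels: {path}"
            )
        return wav.getframerate(), wav.readframes(wav.getnframes())
```

Stereo input would need to be down-mixed or split into separate channels before being passed to such a function.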