ericbolo comments

Results: 35 comments of ericbolo

ReadToken errors in multi-gpu training

@mixcoder, I am now encountering the same error in a multi-GPU setting, probably for the same reason, i.e. the master attempting to read a file that has not been written...

ReadToken errors in multi-gpu training

I wrote a quick hack to pinpoint the problem. In net/communicator.h, adding a sleep(1) (1 second) before it reads from the log file fixes the issue but adds a 1...

ReadToken errors in multi-gpu training

A colleague and I wrote a cleaner fix. The master job attempts to read from the subjob output file only if the file exists AND is not empty. In src/base/kaldi-utils.cc,...
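The "exists AND is not empty" check described above can be sketched as follows. This is only an illustration in Python, not the actual change to src/base/kaldi-utils.cc (which is C++ and is not shown in the excerpt). The function names `is_ready` and `wait_until_ready`, the timeout, and the polling interval are all assumptions made for the example.

```python
import os
import time


def is_ready(path):
    """Return True only if the file exists AND is not empty."""
    try:
        return os.path.getsize(path) > 0
    except OSError:
        # A missing file (or one not yet visible on a shared filesystem)
        # counts as "not ready" rather than as an error.
        return False


def wait_until_ready(path, timeout=10.0, poll_interval=0.05):
    """Poll `path` until it is ready or `timeout` seconds have elapsed.

    Hypothetical helper: returns True as soon as the file is non-empty,
    False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Compared with an unconditional sleep(1) before every read, polling like this returns as soon as the file is usable, so the delay is only paid when the file is actually late. Note that a non-empty file is not necessarily a completely written one, so a check of this kind narrows the race rather than eliminating it.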

ReadToken errors in multi-gpu training

I do use NFS; I am storing the data in the cloud on AWS EBS. Storing the data locally would imply uploading all the data to my cloud instance...

ReadToken errors in multi-gpu training

Since this issue is NFS-only, I won't submit the pull request. However, if someone is on an NFS system (for instance, on AWS EC2), I noticed a mistake in the...

Eesen for Handwriting Recognition

@wellescastro, I'm curious, any luck with using EESEN for handwriting recognition?

Token Accuracy Drops Obj(log[Pzx])=nan

This is related to the accuracy drop. I was getting nan values and abnormally high values for Obj(log(p(x|z))) while training the model on TED-LIUM (v1). Reducing the number of layers...

Importance of utterance lengths

Related question: I know CMVN (cepstral mean and variance normalization) can suffer from short utterances. In my current dataset I have only one locution per speaker. Has anyone trained on...

Importance of utterance lengths

Thank you, @fmetze! This Kaldi module applies a sliding window for CMVN computation: http://kaldi-asr.org/doc/apply-cmvn-sliding_8cc.html However, I don't understand the advantage of sliding windows. Is it simply a kind of...
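To make the difference between the two schemes concrete, here is a minimal pure-Python sketch of per-utterance CMVN next to a sliding-window variant. It is not Kaldi's apply-cmvn-sliding implementation: the function names, the `window` and `eps` parameters, and the way the window is clamped at the utterance edges are assumptions chosen for illustration.

```python
import math


def cmvn(frames, eps=1e-10):
    """Normalize each feature dimension using statistics of the whole utterance.

    `frames` is a list of equal-length lists of floats (one list per frame).
    """
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    # `eps` guards against division by zero for constant dimensions.
    return [
        [(f[d] - means[d]) / max(stds[d], eps) for d in range(dim)]
        for f in frames
    ]


def sliding_cmvn(frames, window, eps=1e-10):
    """Normalize each frame using statistics from a window of nearby frames.

    The window is roughly centered on the frame and shifted (not shrunk)
    near the utterance boundaries. Recomputing the statistics for every
    frame is O(n * window); that is fine for a sketch, not for production.
    """
    n = len(frames)
    out = []
    for t in range(n):
        start = max(0, t - window // 2)
        end = min(n, start + window)
        start = max(0, end - window)  # keep the window full near the end
        local = cmvn(frames[start:end], eps)
        out.append(local[t - start])
    return out
```

In this sketch, a window at least as long as the utterance reproduces plain per-utterance CMVN exactly. Shorter windows let the statistics follow slow changes within an utterance, at the cost of estimating them from fewer frames.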

Importance of utterance lengths

A quick update: with regular CMVN, no sliding window, the phonetic model reaches 79% token accuracy. So the model learns fairly well in spite of there being short utterances, and...