Junbo Zhang

Results 14 comments of Junbo Zhang

CircleCI reported two errors, but I didn't find the cause. The error message:

```
_________________ ERROR collecting tests/test_dataset_cards.py _________________
tests/test_dataset_cards.py:53: in
@pytest.mark.parametrize("dataset_name", get_changed_datasets(repo_path))
tests/test_dataset_cards.py:35: in get_changed_datasets
diff_output = check_output(["git", "diff",...
```
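As a side note, a minimal sketch of what a helper like `get_changed_datasets` presumably does: shell out to `git diff` with `check_output` and parse the file list. The flags, base ref, and function names here are assumptions for illustration, not the repo's actual code.

```python
from subprocess import check_output


def parse_diff_output(diff_output: bytes):
    # Split git's byte output into one path per non-empty line.
    return [line for line in diff_output.decode("utf-8").splitlines() if line]


def get_changed_files(repo_path, base_ref="origin/main"):
    # Ask git which files differ from the base branch (hypothetical flags).
    diff_output = check_output(
        ["git", "diff", "--name-only", base_ref],
        cwd=repo_path,
    )
    return parse_diff_output(diff_output)
```

If `check_output` raises here (e.g. the base ref is missing in a shallow CI clone), pytest reports it as a collection error exactly like the one above, since the call runs at parametrize time.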

I think you are right. In the `speechocean762` dataset, 50% of the speakers have good pronunciation, 25% have so-so pronunciation, and the remaining 25% have poor pronunciation. However, ever...

I just balanced the training data with a small trick, and the performance looks better. The new version is on the following branch: https://github.com/jimbozhang/kaldi/tree/jzhang.gop.balanced_traindata Could @thangdc94 please check it?
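A minimal sketch of the kind of balancing trick described above (not the actual code on that branch): oversample utterances from the under-represented score classes, with replacement, until each class contributes equally. The field name `key` and the sampling scheme are assumptions.

```python
import random
from collections import defaultdict


def balance_by_class(examples, key, seed=0):
    """Oversample minority classes so all classes have equal counts.

    examples: list of dicts; key: the field holding the class label.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for ex in examples:
        by_class[ex[key]].append(ex)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(items)
        # Top up smaller classes by sampling with replacement.
        balanced.extend(rng.choice(items) for _ in range(target - len(items)))
    return balanced
```

Oversampling keeps all original data and only duplicates minority examples, which is usually safer than discarding majority-class utterances when the dataset is small.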

Thanks very much for your testing and suggestions. As future work, word-level and sentence-level scoring are planned for this recipe, but for now we do not plan...

In my testing, the same as Kamairo's case, the NNPack FC layer's result is indeed different from the internal implementation's result. I tried to fix this bug but failed. 😞
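For reference, a small sketch of how such a discrepancy can be checked: compute the fully-connected layer with plain NumPy as the reference and compare another backend's output within a tolerance. The tolerance value and function names are assumptions, not part of the original report.

```python
import numpy as np


def fc_forward(x, weight, bias):
    # Reference fully-connected layer: y = x @ W^T + b,
    # with weight shaped (out_features, in_features).
    return x @ weight.T + bias


def outputs_match(y_ref, y_other, atol=1e-5):
    # Elementwise comparison within an absolute tolerance;
    # a False here signals the kind of mismatch described above.
    return bool(np.allclose(y_ref, y_other, atol=atol))
```

Comparing against a pure-NumPy reference helps tell a genuine backend bug apart from ordinary floating-point reordering, which should stay within a small tolerance.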

> Thanks for your contribution, @jimbozhang. Are you still interested in adding this dataset? > > We are removing the dataset scripts from this GitHub repo and moving them to...

Sorry for the delayed reply. @wangkenpu Regarding the LPR computation, I agree with you but I believe it should not significantly impact the result. Regarding the document, I appreciate you...

Would it be soon enough if I test it next week? Or maybe @pzelasko has time to do this. :neutral_face:

Let me try it following your suggestion and make a PR to fix it.

Yes, it is much faster! 👍

Using 48 cores on `('dev-clean', 'test-clean')`:

```
real 0m47.141s
user 3m11.562s
sys 1m39.085s
```

Using 72 cores on `('dev-clean', 'test-clean', 'train-clean-100')`:

```
real 6m20.349s...
```
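A hedged sketch of the pattern behind this kind of speed-up: fan the per-part work out over many cores with `multiprocessing.Pool`. The part names mirror the comment; `process_part` is a stand-in for the real per-part work (e.g. feature extraction), not the project's actual code.

```python
from multiprocessing import Pool


def process_part(part):
    # Stand-in for the real per-part work; here we just return
    # the part name and a trivial derived value.
    return part, len(part)


def run_parallel(parts, num_workers=4):
    # Map each dataset part to a worker process and gather results.
    with Pool(num_workers) as pool:
        return dict(pool.map(process_part, parts))
```

Since the parts are processed independently, wall-clock time (`real`) drops roughly with the core count while total CPU time (`user` + `sys`) stays similar, matching the timings above.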