palles77 need your help

Open Valery813 opened this issue 4 years ago • 14 comments

Dear palles77, you have trained a language model for Polish and English. Could you share documentation, or the sequence of steps, for doing this myself, explained clearly enough for a novice? What are the *.deduped files, and what should I do with them? I want to end up with a ready-made model for Julius or Kaldi.

Valery813 avatar Sep 23 '20 16:09 Valery813

Maybe there is a training video?

Valery813 avatar Sep 23 '20 16:09 Valery813

Hi Valery. I appreciate your enthusiasm for learning how to create language models. It is a fairly involved process. I don't have any video or documentation, for that matter, as everything is in my head. You should probably start by buying some books and learning about language models. There is quite a lot of information about that on the Internet.

palles77 avatar Sep 23 '20 17:09 palles77

I understand, I will search for information. Thanks

Valery813 avatar Sep 23 '20 17:09 Valery813

I will soon publish these in a separate repo. Stay tuned.

palles77 avatar Feb 25 '21 07:02 palles77

This is excellent

Valery813 avatar Feb 25 '21 17:02 Valery813

Hello, have you done any transcribing with KALDI? I tried to build the SWE model, but there were a lot of errors in the scripts.

Valery813 avatar Feb 26 '21 09:02 Valery813

No. Its learning curve is too steep. My training scripts for Julius achieve state-of-the-art results. They are probably the best training scripts out there. I will post them soon in a separate repo.

palles77 avatar Feb 26 '21 10:02 palles77

Hello, what language models will there be?

Valery813 avatar Feb 27 '21 07:02 Valery813

It will be for English, but I am planning to start doing releases for other languages.

palles77 avatar Feb 28 '21 21:02 palles77

Hi, I'm waiting for the release of your model. Let me know when you post it, OK? I made models of the Swedish and Danish languages for KALDI, but the transcription results are poor (%WER = ~5-10%).

Valery813 avatar Mar 09 '21 11:03 Valery813

I am slowly starting to prepare the material. Initial repo: https://github.com/palles77/htk-cuda. More to follow soon.

palles77 avatar Mar 15 '21 11:03 palles77

Hi, how are you doing with the English model? I tested the aspire model and the alphacep vosk model. The results are not very good: WER ~ 50%.

Valery813 avatar Mar 24 '21 10:03 Valery813

I am making progress on this work. I will soon be releasing an upgraded version of Julius. Then I will focus on releasing training procedures for the English language. You can always contact me by private email: silesiaresearch at gmail dot com

palles77 avatar Mar 24 '21 10:03 palles77

How are you doing? Do you have anything to test? )))

Valery813 avatar Apr 15 '21 09:04 Valery813