deepspeech.torch

Regarding the Language Model used

Open SarthakYadav opened this issue 7 years ago • 7 comments

DeepSpeech paper [1] says that at inference time CTC models are paired with a language model.

Which language model does this implementation use? Where is the language model written/stored/called in the code?

How can I use my own language model with the network?

SarthakYadav avatar Apr 05 '17 10:04 SarthakYadav

This is equivalent to the implementation before adding the language model. If you wanted to replicate the paper, you would spit out the top 1000 beams from the DS model and rescore them with a LM. (This would take a bit of extra work of course...)
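As a rough illustration of that rescoring step (the repo itself is Lua/Torch, so this is just a Python sketch; `lm_score` is a hypothetical stand-in for whatever LM you plug in):

```python
def rescore(nbest, lm_score, alpha=0.5):
    """Rerank an n-best list from the acoustic model with an LM.

    nbest:    list of (transcript, acoustic_log_prob) pairs.
    lm_score: stand-in for any function returning a log-probability
              for a transcript (e.g. backed by an n-gram model).
    alpha:    relative weight of the LM score (a tunable knob).
    """
    return max(nbest, key=lambda h: h[1] + alpha * lm_score(h[0]))
```

The point is just that the DS model's scores are computed once; the LM only reranks the finished hypotheses.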

mtanana avatar Apr 05 '17 15:04 mtanana

Yeah I "actually" went through the code this time and already realized there isn't an LM.

Oh. That sounds alright. Can you give me some pointers on how to

  1. spit out the top 1000 beams? (I am new to Torch.)
  2. Is there any pre-built LM for Torch, or an interface to an existing language model, say, KenLM?

SarthakYadav avatar Apr 05 '17 16:04 SarthakYadav

  1. This should be easier than other problems because you can just take the outcome probabilities and walk through them, saving the top 1000 at each step. (This will be a bit like doing Viterbi search, if you're familiar with it...) In other words, you won't ever have to re-run the lower parts of the model, since the output at time t doesn't depend on the output at t-1.

  2. There are some neural LMs for Torch (https://github.com/mtanana/torchneuralconvo https://github.com/karpathy/char-rnn), but these will be a lot slower to run over 1000 examples than an n-gram model, which is what the DS paper used. Either way, you'd have to write a crosswalk to someone else's code...
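A minimal Python sketch of point 1, under the assumption that you can dump the per-timestep output log-probabilities from the network: walk through them, extending and re-sorting the current beams at each step. (A faithful Deep Speech decoder would also collapse CTC repeats and blanks, which is omitted here.)

```python
import math

def topk_beams(step_log_probs, k=1000):
    """Keep the k best label sequences over time.

    step_log_probs: list of {label: log_prob} dicts, one per time step.
    Because the network's output at time t doesn't depend on earlier
    outputs, each step only extends and re-sorts the current beams.
    """
    beams = [((), 0.0)]  # (sequence, cumulative log-prob)
    for step in step_log_probs:
        candidates = [
            (seq + (label,), score + lp)
            for seq, score in beams
            for label, lp in step.items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:k]
    return beams
```

The resulting list of (sequence, log-prob) pairs is exactly the n-best list you would then hand to the LM for rescoring.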

mtanana avatar Apr 05 '17 17:04 mtanana

And note... the paper mentioned a weighting between the score from the DS model and the score from the LM... it wasn't clear whether this was estimated or set like a hyper-parameter.
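For what it's worth, my reading of the first Deep Speech paper is that the combined score is a weighted sum of the CTC score, the LM score, and a word-count bonus, with the weights tuned on held-out data, i.e. treated as hyper-parameters:

```python
def combined_score(ctc_log_prob, lm_log_prob, word_count,
                   alpha=1.0, beta=1.0):
    """Weighted rescoring objective, roughly as in the Deep Speech
    paper: Q(c) = log P_ctc(c|x) + alpha*log P_lm(c) + beta*|words|.
    alpha and beta are tuned on a held-out set (hyper-parameters)."""
    return ctc_log_prob + alpha * lm_log_prob + beta * word_count
```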

mtanana avatar Apr 05 '17 17:04 mtanana

I'm still playing with the base model code, but once I get better results, I'd be happy to help with this part...but I'm a month or two from where I'll have time...

mtanana avatar Apr 05 '17 17:04 mtanana

I won't be able to implement the language model due to time constraints, but it definitely is a large part of the project and would improve the model's performance substantially. A lot of reference material can be found in the original Deep Speech 2 paper.

SeanNaren avatar Apr 19 '17 17:04 SeanNaren

Dear all, I am trying to integrate the LM into the calculation of the final score estimated by the neural network. I am new to Torch; could anyone give me a starting point for getting the outcome probabilities from the neural network? Thanks in advance.

menamine avatar May 11 '18 14:05 menamine