kaldi-gstreamer-server
RNN LM Rescoring
Hi, I just wanted to ask, is there any way to rescore the lattice using recurrent neural network language model with the current kaldi-gstreamer setup?
Thanks a lot!
It's theoretically possible using the post-processing framework and n-best lists, but it would be quite complicated.
@jin1004 I'm not sure which RNN library you're using, but if it has Python bindings you could do something like this:
- create a new version of the sample_full_post_processor.py file
- import the rescoring method from your rnnlm library (and any other dependencies for it) in that file. In this example we will call the method "rescore_sentence" and assume that it returns a likelihood for the given sentence
- find the line in the post_process_json method that reads "if len(event["result"]["hypotheses"]) > 1:" and insert a line after it
- on that line write this (all on one line): `event['result']['hypotheses'] = sorted(event['result']['hypotheses'], key=lambda x: rescore_sentence(x['transcript']), reverse=True)`
- save the file
- in your xxx.yaml file, replace "sample_full_post_processor.py" with the name of your new post-processor file
- save that file and make sure your worker is using it
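The steps above can be sketched roughly as below. This is a minimal, self-contained illustration rather than the actual `sample_full_post_processor.py`: `rescore_sentence` is the hypothetical method imported from your RNNLM library, and the dummy body here (which just scores by sentence length) only exists so the example runs.

```python
def rescore_sentence(sentence):
    # Placeholder for your RNNLM library's scoring call; it should
    # return a likelihood for the sentence (higher = better).
    # Here we use a dummy length-based score purely for illustration.
    return float(len(sentence))

def post_process_json(event):
    # Mirrors the shape of the event dicts handled by the sample
    # post-processor: reorder the n-best list so the hypothesis the
    # RNNLM likes best comes first.
    if "result" in event and len(event["result"]["hypotheses"]) > 1:
        event["result"]["hypotheses"] = sorted(
            event["result"]["hypotheses"],
            key=lambda x: rescore_sentence(x["transcript"]),
            reverse=True)
    return event
```

The client then simply reads the first hypothesis as usual, so nothing downstream needs to change.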
It should now return the hypothesis transcript with the highest likelihood under your RNNLM. That's probably the simplest way to do it, although you would obviously need to add checks to make sure the model loads correctly, the library can be imported, etc., so that a failure won't break the whole program.
If you are using a library without Python bindings (I don't think faster-rnnlm has any, for example), then you would probably need to spawn a subprocess or something similar, which would slow things down and potentially introduce additional complications. If speed isn't an urgent concern, that approach could still work.
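A subprocess-based scorer might look like the sketch below. The command here (`wc -w`, i.e. word count) is just a stand-in for whatever RNNLM binary you'd actually invoke; it assumes the scorer reads one sentence on stdin and prints a single number.

```python
import subprocess

def rescore_sentence(sentence, cmd=("wc", "-w")):
    """Score a sentence by piping it to an external process.

    `cmd` is a placeholder: swap in the real invocation of your RNNLM
    scorer. Spawning one process per hypothesis is slow, so for real
    use you'd likely want a single long-running scorer process instead.
    """
    result = subprocess.run(cmd, input=sentence.encode("utf-8"),
                            stdout=subprocess.PIPE, check=True)
    return float(result.stdout.strip())
```

With that in place, the sorting line in the post-processor stays exactly the same; only the implementation of `rescore_sentence` changes.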
I just made this off the cuff, so if there are any problems or I misunderstood what you are trying to do, let me know.
@calderma Thank you so much! My RNN language model is still in training. I will test it with the method you described and let you know if it works.
@jin1004 Did the proposed solution of @calderma work?
Does anyone have a solution to this?