
Add access to vocabulary in python bindings

Open cypreess opened this issue 10 years ago • 4 comments

It would be nice to have access to kenlm.LanguageModel.vocab, or even (perhaps the more Pythonic way) to support the iterable protocol on kenlm.LanguageModel.

cypreess avatar Mar 25 '15 08:03 cypreess

Would a callback from LoadVirtual be sufficient?

kpu avatar Mar 26 '15 14:03 kpu

The C++ side doesn't even remember the vocabulary strings by default because users either don't need it or have their own data structure populated by the EnumerateVocab callback API.

kpu avatar Mar 26 '15 15:03 kpu

I must say I have not read very deeply into the implementation. I was just wondering whether it would be easy to implement access to the vocabulary somehow.

cypreess avatar Mar 26 '15 15:03 cypreess

@kpu Is there any way to access the LanguageModel vocabulary from the Python wrapper? I am loading the model as kenlm.Model("model.klm") in Python; "model.klm" was built from the command line.

manishbansal-fk avatar Jan 09 '18 12:01 manishbansal-fk
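
Since the thread does not show a vocabulary accessor in the Python bindings, one possible workaround is to read the word list from the ARPA text file the model was estimated into, assuming that file is still available alongside the binary. The sketch below does not use kenlm at all; it only relies on the standard ARPA layout, where each line of the `\1-grams:` section is a log probability, a word, and an optional backoff weight separated by tabs. The function name `read_arpa_vocab` and the sample data are hypothetical, chosen for illustration.

```python
import io


def read_arpa_vocab(lines):
    # Hypothetical helper: yield each word in the \1-grams: section
    # of an ARPA-format language model, in file order.
    in_unigrams = False
    for raw in lines:
        line = raw.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if not in_unigrams or not line:
            continue
        if line.startswith("\\"):
            # Next section header (\2-grams: or \end\) ends the unigrams.
            break
        fields = line.split("\t")
        if len(fields) >= 2:
            # fields[0] is the log10 probability, fields[1] is the word.
            yield fields[1]


# Tiny made-up ARPA fragment so the example runs without a model file.
sample = (
    "\\data\\\n"
    "ngram 1=3\n"
    "ngram 2=1\n"
    "\n"
    "\\1-grams:\n"
    "-1.0\t<unk>\t0\n"
    "-0.5\t<s>\t-0.3\n"
    "-0.7\thello\t-0.2\n"
    "\n"
    "\\2-grams:\n"
    "-0.1\t<s> hello\n"
    "\n"
    "\\end\\\n"
)

vocab = list(read_arpa_vocab(io.StringIO(sample)))
print(vocab)
```

For a real model, the same function could be given an open file handle, for example `open("model.arpa", encoding="utf-8")`, where the path is a placeholder. This does not help if only the binary file was kept.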