
Frequency Map

Open GoogleCodeExporter opened this issue 9 years ago • 1 comments

Good Afternoon,

How can I generate a frequency map of n-grams?

Thank you.

Original issue reported on code.google.com by [email protected] on 8 Dec 2014 at 4:48

GoogleCodeExporter, Jul 16 '15 16:07

It depends on what you mean. If you want to compute n-gram frequencies from raw
text, we don't support that. If you want an efficient in-memory representation
of n-gram counts that you have already put in Google n-grams format, then you
can build a StupidBackoffLm and get access to the underlying counts using:

https://code.google.com/p/berkeleylm/source/browse/trunk/src/edu/berkeley/nlp/lm/StupidBackoffLm.java#132

See here for an example:

https://code.google.com/p/berkeleylm/source/browse/trunk/src/edu/berkeley/nlp/lm/io/MakeNgramMapBinaryFromGoogle.java#40

Let me know if you need further help.
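
For the raw-text case, which the library does not cover, a frequency map can be built with plain Java collections. The following is only a minimal sketch and is not part of berkeleylm: the class name `NgramFrequencyMap` and the method `countNgrams` are hypothetical, and it assumes simple whitespace tokenization with no sentence-boundary markers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of berkeleylm.
public class NgramFrequencyMap {

    /**
     * Counts every n-gram of exactly the given order in a text.
     * Assumption: tokens are separated by whitespace.
     */
    public static Map<List<String>, Long> countNgrams(String text, int order) {
        if (order < 1) {
            throw new IllegalArgumentException("order must be >= 1");
        }
        Map<List<String>, Long> counts = new HashMap<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return counts;
        }
        String[] tokens = trimmed.split("\\s+");
        // Slide a window of `order` tokens over the text.
        for (int i = 0; i + order <= tokens.length; i++) {
            List<String> ngram = new ArrayList<>(order);
            for (int j = i; j < i + order; j++) {
                ngram.add(tokens[j]);
            }
            // Insert 1 for a new n-gram, otherwise add 1 to its count.
            counts.merge(ngram, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<List<String>, Long> bigrams = countNgrams("the cat sat on the cat", 2);
        // Print each n-gram and its count separated by a tab.
        for (Map.Entry<List<String>, Long> e : bigrams.entrySet()) {
            System.out.println(String.join(" ", e.getKey()) + "\t" + e.getValue());
        }
    }
}
```

A `HashMap` keyed by token lists is easy to read but memory-hungry, so this only suits small corpora. For large count sets, the compact in-memory representation described above is the better fit.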

Original comment by [email protected] on 9 Dec 2014 at 1:35

GoogleCodeExporter, Jul 16 '15 16:07