go-ngram
N-gram index for Go.
Key features
- Unicode support.
- Append-only: data cannot be deleted from the index.
- GC-friendly (all strings are pooled and compressed).
- Application-agnostic (there is no notion of a document or anything else that the user needs to implement).
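To make the Unicode point concrete, here is a minimal sketch of splitting a string into n-grams by rune rather than by byte. The helper name `splitNGrams` is hypothetical and is not part of go-ngram; the library's actual tokenization may differ (for example, it may pad the input).

```go
package main

import "fmt"

// splitNGrams is a hypothetical helper (not go-ngram's API).
// It returns every window of n consecutive runes in s.
// Working on []rune instead of bytes keeps multi-byte characters intact.
func splitNGrams(s string, n int) []string {
	runes := []rune(s)
	if n <= 0 || len(runes) < n {
		return nil
	}
	grams := make([]string, 0, len(runes)-n+1)
	for i := 0; i+n <= len(runes); i++ {
		grams = append(grams, string(runes[i:i+n]))
	}
	return grams
}

func main() {
	fmt.Println(splitNGrams("hello", 3))  // [hel ell llo]
	fmt.Println(splitNGrams("привет", 3)) // [при рив иве вет]
}
```

Slicing the string by bytes instead would cut the Cyrillic characters in half, which is why the sketch converts to `[]rune` first.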
Usage
index, err := ngram.NewNGramIndex(ngram.SetN(3))
tokenId, err := index.Add("hello")
str, err := index.GetString(tokenId) // str == "hello"
resultsList, err := index.Search("world")
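For readers unfamiliar with n-gram indexes, the following self-contained toy shows the general idea behind the add/get/search cycle above. It is an illustrative sketch only, not go-ngram's implementation: the type `toyIndex` and its methods are hypothetical, strings are stored uncompressed, and results are ranked simply by the number of shared trigrams, which is an assumption rather than the library's scoring.

```go
package main

import (
	"fmt"
	"sort"
)

// toyIndex is a hypothetical, simplified stand-in for an n-gram index.
type toyIndex struct {
	n        int
	strings  []string         // token ID -> original string (append-only)
	postings map[string][]int // n-gram -> IDs of strings containing it
}

func newToyIndex(n int) *toyIndex {
	return &toyIndex{n: n, postings: make(map[string][]int)}
}

// grams returns the distinct rune-based n-grams of s.
func (ix *toyIndex) grams(s string) []string {
	runes := []rune(s)
	seen := make(map[string]bool)
	var out []string
	for i := 0; i+ix.n <= len(runes); i++ {
		g := string(runes[i : i+ix.n])
		if !seen[g] {
			seen[g] = true
			out = append(out, g)
		}
	}
	return out
}

// add stores s and returns its token ID. Nothing is ever removed.
func (ix *toyIndex) add(s string) int {
	id := len(ix.strings)
	ix.strings = append(ix.strings, s)
	for _, g := range ix.grams(s) {
		ix.postings[g] = append(ix.postings[g], id)
	}
	return id
}

// search returns IDs of stored strings that share at least one n-gram
// with query, ordered by shared n-gram count (highest first, then by ID).
func (ix *toyIndex) search(query string) []int {
	shared := make(map[int]int)
	for _, g := range ix.grams(query) {
		for _, id := range ix.postings[g] {
			shared[id]++
		}
	}
	ids := make([]int, 0, len(shared))
	for id := range shared {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if shared[ids[a]] != shared[ids[b]] {
			return shared[ids[a]] > shared[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	ix := newToyIndex(3)
	ix.add("hello") // ID 0
	ix.add("help")  // ID 1
	ix.add("world") // ID 2
	for _, id := range ix.search("hell") {
		fmt.Println(id, ix.strings[id])
	}
}
```

In this toy, searching for "hell" finds "hello" (two shared trigrams) ahead of "help" (one shared trigram), while "world" shares none and is not returned.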
TODO:
- Smoothing functions (Laplace, etc.)