ltlangpack

Tools for Lithuanian language processing

Lithuanian language processing tools for use in NLP, search, and other applications.

Sentence detection

Folder: sentence-detect

OpenNLP model for Lithuanian sentence detection.

Scripts to help with building the model:

  • add - append new text into the model (see comment inside the script)
  • train - build model based on example corpora
  • evaluate - evaluate detection quality

Snowball

The Snowball version of the Porter stemmer for the Lithuanian language has been moved to this page.

Language detection

Folder: language-detect

N-grams for Lithuanian language detection, used in Apache Tika (see https://issues.apache.org/jira/browse/TIKA-582).
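To illustrate how N-grams can drive language detection, here is a minimal Python sketch. It is not the Apache Tika implementation and does not read the files in the language-detect folder. It assumes character trigrams compared by cosine similarity, and the two training sentences, the `SAMPLES` dictionary, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def trigram_profile(text: str) -> Counter:
    """Count character trigrams per word, padding word edges with '_'."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    return counts


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy training samples; a real profile would be built
# from a much larger corpus.
SAMPLES = {
    "lt": "Lietuvių kalba yra viena iš seniausių kalbų Europoje ir ji yra labai graži",
    "en": "The English language is one of the most widely spoken languages in the world",
}
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLES.items()}


def detect_language(text: str) -> str:
    """Return the language whose trigram profile is closest to the text."""
    query = trigram_profile(text)
    return max(PROFILES, key=lambda lang: cosine_similarity(query, PROFILES[lang]))
```

With these toy samples, `detect_language("kalba yra graži")` returns `"lt"`. Profiles this small are only good for demonstration; short or mixed-language input needs far more training text to be classified reliably.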

License

Copyright (C) 2011 UAB TokenMill

Distributed under the Eclipse Public License.