corpus-processing topic

Repositories tagged with the corpus-processing topic

Wordless

673 Stars, 88 Forks

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation

bitextor

287 Stars, 43 Forks

Bitextor generates translation memories from multilingual websites

TreebankPreprocessing

162 Stars, 43 Forks

Python scripts for preprocessing the Penn Treebank and Chinese Treebank

OPIEC

36 Stars, 6 Forks

Reading the data from OPIEC - an Open Information Extraction corpus

corpuslingr

21 Stars, 1 Fork

A library of functions enabling complex corpus search in context (KWIC), search aggregation, bag-of-words building & keyphrase extraction.

alvisnlp

16 Stars, 6 Forks

AlvisNLP corpus processing engine

corpusexplorer2.0

20 Stars, 3 Forks

Corpus linguistics has never been this easy...