Language Machines

Results: 5 repositories owned by Language Machines

ucto

Stars: 63, Forks: 13

Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use to...
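
To make the description above concrete, here is a minimal sketch of the three steps it names: separating words from punctuation, splitting sentences, and optionally changing case. It is an illustration only and does not use ucto's API or its language-specific rules; the names `tokenize`, `split_sentences` and `preprocess` are hypothetical, and the regular expression is a deliberately naive assumption.

```python
import re

# Hypothetical helpers for illustration; this is NOT ucto's API or rule set.
# A token is either a run of word characters or one punctuation character.
TOKEN_RE = re.compile(r"\w+|[^\w\s]", re.UNICODE)
SENTENCE_END = {".", "!", "?"}


def tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    return TOKEN_RE.findall(text)


def split_sentences(tokens):
    """Group tokens into sentences, closing one at each '.', '!' or '?'."""
    sentences, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in SENTENCE_END:
            sentences.append(current)
            current = []
    if current:  # keep trailing tokens that lack a final sentence marker
        sentences.append(current)
    return sentences


def preprocess(text, lowercase=False):
    """Optionally lowercase the text, then tokenize and split into sentences."""
    if lowercase:
        text = text.lower()
    return split_sentences(tokenize(text))
```

For example, `preprocess("Hello, world! Dit is een test.", lowercase=True)` yields two token lists, one per sentence. A naive rule like this would wrongly split at abbreviations such as "Dr."; handling such cases is the kind of work a dedicated tokeniser exists for.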

frog

Stars: 73, Forks: 11

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.

LuigiNLP

Stars: 21, Forks: 4

A workflow system for Natural Language Processing.

PICCL

Stars: 46, Forks: 6

A set of workflows for corpus building through OCR, post-correction and normalisation.

timbl

Stars: 46, Forks: 9

TiMBL implements several memory-based learning algorithms.
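
As a rough illustration of what memory-based learning means, the sketch below stores labelled examples verbatim and classifies a new instance by majority vote among its nearest stored neighbours. It is a generic nearest-neighbour sketch under stated assumptions, not TiMBL's implementation or API: the function names, the unweighted feature-mismatch distance, and the toy data are all hypothetical.

```python
from collections import Counter

# Illustrative only; NOT TiMBL's API. "Memory" is a list of
# (feature_tuple, label) pairs that are stored without any abstraction.


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(memory, instance, k=1):
    """Return the majority label among the k stored examples nearest to instance."""
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy memory: (previous word, current word) -> part-of-speech label.
memory = [
    (("de", "kat"), "N"),
    (("de", "loopt"), "V"),
    (("een", "hond"), "N"),
]
```

With this toy memory, `classify(memory, ("een", "kat"), k=3)` picks the label shared by most of the three nearest stored examples. Features are treated as equally important here, which is a simplifying assumption of this sketch.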