tokeniser topic

List tokeniser repositories

ucto

Stars: 63 · Forks: 13

Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing steps such as changing case that you can all use to...

Tokenize2

Stars: 82 · Forks: 25

Tokenize2 is a plugin which allows your users to select multiple items from a predefined list or ajax, using autocompletion as they type to find each item. You may have seen a similar type of text ent...

tok-tok

Stars: 28 · Forks: 3

A fast, simple, multilingual tokenizer

taibun

Stars: 23 · Forks: 1

Taiwanese Hokkien Transliterator and Tokeniser