tokenizer topic
Chiffon
A small ECMAScript parser, tokenizer and minifier written in JavaScript.
thot
Thot toolkit for statistical machine translation
spacy-experimental
🧪 Cutting-edge experimental spaCy components and features
wink-tokenizer
Multilingual tokenizer that automatically tags each token with its type
python-vncorenlp
A Python wrapper for VnCoreNLP using a bidirectional communication channel.
DumbLuaParser
Lua parsing library capable of optimizing and minifying code.
JPOPHP
JSON Parser Object PHP is a library for parsing data in JSON format.
greeb
Greeb is a simple Unicode-aware regexp-based tokenizer.
tokenizer
NLP tokenizers written in Go