tokenizer topic
omnicat-bayes
Naive Bayes text classification implementation as an OmniCat classifier strategy. (#ruby #naivebayes)
suika
Suika 🍉 is a Japanese morphological analyzer written in pure Ruby
bredon
A modern CSS value compiler in JavaScript
snapdragon-lexer
Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.
psr2r-sniffer
A PSR-2-R code sniffer and code-style auto-correction tool, including many useful additions
python-vaporetto
🛥 Vaporetto is a fast and lightweight tokenizer based on pointwise prediction. This is a Python wrapper for Vaporetto.
nlpo3
Thai Natural Language Processing library in Rust, with Python and Node bindings.