tokenizer topic
python-mecab
A repository that binds MeCab for Python 3.5+, using neither SWIG nor pybind. (No longer maintained.)
esperanto-analyzer
Morphological and syntactic analysis of Esperanto sentences
transphone
Phoneme tokenizer and grapheme-to-phoneme model for 8k languages
Lisp-esque-language
💠The Lel programming language
guide-to-interpreters-series
Source code for viewers following along with my Beginner's Guide to Building Interpreters series on my YouTube channel.
Loretta
A C# Lua, GLua and Luau parser, code analysis, transformation and generation library.
python-vibrato
Viterbi-based accelerated tokenizer (Python wrapper)
tiptap-annotation-magic
An extension for the Tiptap editor, enabling the annotation of text. Comes with support for overlapping annotations, useful for e.g. NLP tokenization.
tivars_lib_cpp
A C++ library to interact with TI-z80 (82/83/84 series) calculator files (programs, lists, matrices, etc.)