word-segmentation topic
word_tokenize
Vietnamese Word Tokenize
hellonlp
NLP tools, word segmentation, sentence segmentation, New-Word-Discovery,新词发现
WordSegmentationDP
Word Segmentation with Dynamic Programming
hashformers
Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs).
spell
Spelling correction and string segmentation written in Go
sentencepiece-jni
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
nlp-roadmap
🗺️ 一个自然语言处理的学习路线图
SymSpellCppPy
Fast SymSpell written in c++ and exposes to python via pybind11
TrTokenizer
🧩 A simple sentence tokenizer.