tokenization topic
php-text-analysis
PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language
razdel
Rule-based token, sentence segmentation for Russian language
vibrato
🎤 vibrato: Viterbi-based accelerated tokenizer
l8w8jwt
Minimal, OpenSSL-less and super lightweight JWT library written in C.
charformer-pytorch
Implementation of the GBST block from the Charformer paper, in Pytorch
vaulty
Tokenize, encrypt/decrypt, mask your data on the fly with Vaulty proxy
dlp-dataflow-deidentification
Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP
wink-tokenizer
Multilingual tokenizer that automatically tags each token with its type
TweebankNLP
An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
nlp-cheat-sheet-python
NLP Cheat Sheet, Python, spacy, LexNPL, NLTK, tokenization, stemming, sentence detection, named entity recognition