sentencepiece topic
huggingface_libtorch
Minimal example of using a traced huggingface transformers model with libtorch
RTranslator
Open source real-time translation app for Android that runs locally
konoha
🌿 An easy-to-use Japanese text processing tool that lets you switch tokenizers with only small code changes.
Tokenizer
Fast and customizable text tokenization library with BPE and SentencePiece support
sentencepiece
R package for Byte Pair Encoding / Unigram modelling based on SentencePiece
gpt2-pytorch
Extremely simple and understandable GPT2 implementation with minor tweaks
piecelearn
Learning BPE embeddings by first learning a segmentation model and then training word2vec
vietnamese-roberta
A Robustly Optimized BERT Pretraining Approach for Vietnamese
sentencepiece_chinese_bpe
Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.
sentencepiece
Rust binding for the sentencepiece library