sentencepiece topic

List sentencepiece repositories

huggingface_libtorch

33 stars · 13 forks

Minimal example of using a traced huggingface transformers model with libtorch

RTranslator

6.6k stars · 496 forks · 42 watchers

Open source real-time translation app for Android that runs locally

konoha

214 stars · 21 forks

🌿 An easy-to-use Japanese text processing tool that lets you switch tokenizers with only small code changes.

Tokenizer

268 stars · 66 forks

Fast and customizable text tokenization library with BPE and SentencePiece support

sentencepiece

24 stars · 6 forks

R package for Byte Pair Encoding / Unigram modelling based on Sentencepiece

gpt2-pytorch

19 stars · 3 forks

Extremely simple and understandable GPT2 implementation with minor tweaks

piecelearn

19 stars · 1 fork

Learning BPE embeddings by first learning a segmentation model and then training word2vec

vietnamese-roberta

27 stars · 5 forks

A Robustly Optimized BERT Pretraining Approach for Vietnamese

sentencepiece_chinese_bpe

109 stars · 16 forks

Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.

sentencepiece

19 stars · 6 forks

Rust binding for the sentencepiece library