minbpe Huggingface already has an efficient implementation of this?

Open laurislopata opened this issue 11 months ago • 3 comments

When Karpathy claimed that an efficient implementation of the BPE optimizer doesn't exist, I did some research and found this in Hugging Face's tokenizers repository: https://github.com/huggingface/tokenizers/blob/main/tokenizers/src/models/bpe/trainer.rs

Isn't this exactly what Karpathy was creating?

Mar 19 '24 23:03 laurislopata
