minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Results 61 minbpe issues

Instead of recomputing the stats from scratch for every merge, we can compute them once and update them during `merge`. This reduces computation, as we update the stats dictionary...
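A minimal sketch of what such an incremental update could look like, not the implementation proposed in the issue (whose text is cut off above). The helper names `get_stats` and `merge_and_update_stats` are hypothetical stand-ins written for this example, and the sketch assumes the stats are kept in a `collections.Counter` keyed by adjacent token pairs. Each time a pair is merged, only the counts touching that occurrence change: the merged pair itself and the pairs it formed with its left and right neighbours.

```python
from collections import Counter


def get_stats(ids):
    """Count how often each adjacent pair occurs in ids (full recount)."""
    return Counter(zip(ids, ids[1:]))


def merge_and_update_stats(ids, pair, idx, stats):
    """Replace every occurrence of `pair` in `ids` with the new token `idx`.

    `stats` (a Counter of adjacent pairs) is updated in place, so no full
    recount is needed afterwards. Assumes `idx` does not yet occur in `ids`.
    """
    def bump(p, delta):
        # Adjust one pair count and drop entries that fall to zero.
        stats[p] += delta
        if stats[p] == 0:
            del stats[p]

    new_ids = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            bump(pair, -1)  # the merged occurrence disappears
            if new_ids:
                # Left neighbour (already in its final, possibly merged form).
                bump((new_ids[-1], pair[0]), -1)
                bump((new_ids[-1], idx), +1)
            if i + 2 < len(ids):
                # Right neighbour (not yet processed).
                bump((pair[1], ids[i + 2]), -1)
                bump((idx, ids[i + 2]), +1)
            new_ids.append(idx)
            i += 2
        else:
            new_ids.append(ids[i])
            i += 1
    return new_ids


# Usage: compute the stats once, then keep them current across merges.
ids = list("aaabdaaabac".encode("utf-8"))
stats = get_stats(ids)
for new_token in range(256, 259):
    top_pair = max(stats, key=stats.get)
    ids = merge_and_update_stats(ids, top_pair, new_token, stats)
```

Because the left neighbour is read from the already-merged output and the right neighbour from the unprocessed input, overlapping occurrences such as `a a a` with pair `(a, a)` are counted consistently. The saving comes from skipping the full recount after each merge; the pass over `ids` itself is still linear.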