minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
minbpe issues
Instead of computing stats from scratch for every merge, we can compute them once and update them incrementally during `merge`. This reduces computation, since we update the stats dictionary...
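A minimal sketch of the idea, not the code from the issue or the repository: the helper names `get_stats` and `merge_and_update` are chosen here for illustration, and the stats are assumed to be a `Counter` of adjacent token pairs. Each merged occurrence only touches the pair itself and its two neighbouring pairs, so those are the only counts adjusted.

```python
from collections import Counter

def get_stats(ids):
    """Count adjacent token pairs; meant to be called once, before training."""
    return Counter(zip(ids, ids[1:]))

def _dec(stats, key):
    """Decrement a pair count and drop the entry when it reaches zero."""
    stats[key] -= 1
    if stats[key] == 0:
        del stats[key]

def merge_and_update(ids, pair, idx, stats):
    """Replace each occurrence of `pair` with `idx`, updating `stats` in place.

    Hypothetical helper (an assumption of this sketch): it returns the new
    token list and leaves `stats` equal to what a full recount would give.
    """
    out = []
    i, n = 0, len(ids)
    while i < n:
        if i + 1 < n and ids[i] == pair[0] and ids[i + 1] == pair[1]:
            _dec(stats, pair)                       # the merged pair disappears
            if out:
                _dec(stats, (out[-1], ids[i]))      # old left-neighbour pair
                stats[(out[-1], idx)] += 1          # new left-neighbour pair
            if i + 2 < n:
                _dec(stats, (ids[i + 1], ids[i + 2]))  # old right-neighbour pair
                stats[(idx, ids[i + 2])] += 1          # new right-neighbour pair
            out.append(idx)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out
```

A training loop under these assumptions would call `get_stats` once, then repeatedly pick `max(stats, key=stats.get)` and pass it to `merge_and_update`, so each step costs one pass over the tokens rather than a pass plus a full recount.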