Charlie Gao

Is there any update on this issue? It seems that BPE tokenizers get stuck at `Computing merges` when trained on a long corpus.