sentencepiece
Can we train with parallel computing, multithreading, or multiprocessing?
Can we train with parallel computing, multithreading, or multiprocessing to speed up training? Thank you.
@joytianya Yes, we can! For example, look at YouTokenToMe. This BPE implementation uses parallel processing quite efficiently.
Thank you. I will take a look. Actually, the current BPE algorithm is a little conservative in how it finds the most frequent pairs.
Will work on it in the next release.
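For anyone wondering what parallelizing the "find the most frequent pairs" step could look like, here is a minimal sketch. It is not how SentencePiece or YouTokenToMe implement it; the function names (`count_pairs`, `count_pairs_parallel`) and the toy corpus are made up for illustration. The idea is simply to split the word list into shards, count adjacent symbol pairs per shard, and merge the partial counts.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_pairs(word_freqs):
    """Count adjacent symbol pairs in one shard of (symbols, freq) items."""
    counts = Counter()
    for symbols, freq in word_freqs:
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts


def count_pairs_parallel(word_freqs, workers=4, executor_cls=ThreadPoolExecutor):
    """Shard the input, count pairs per shard, then merge the partial counts.

    executor_cls is a hypothetical knob for this sketch: any
    concurrent.futures executor class can be passed in.
    """
    items = list(word_freqs)
    # Round-robin sharding keeps shard sizes roughly equal.
    shards = [items[i::workers] for i in range(workers)]
    total = Counter()
    with executor_cls(max_workers=workers) as pool:
        for partial in pool.map(count_pairs, shards):
            total.update(partial)  # Counter.update adds counts together
    return total


# Toy corpus (made up): each word is a tuple of symbols plus its frequency.
corpus = [
    (("l", "o", "w"), 5),
    (("l", "o", "w", "e", "r"), 2),
    (("n", "e", "w"), 6),
]
pairs = count_pairs_parallel(corpus, workers=2)
print(pairs.most_common(3))
```

Threads are used here only to keep the sketch easy to run. In CPython the GIL means pure-Python counting will not get faster with threads, so for real CPU parallelism you would pass `ProcessPoolExecutor` as `executor_cls` (under an `if __name__ == "__main__":` guard) or do the counting in native code.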
I am really looking forward to parallel training, as running Asian language corpora on multi-core computers is extremely slow, making me feel like I am wasting my CPU...
Will work on it in the next release.
Hi @taku910 - I was wondering whether this feature has been released. Thank you.