tokenizers
Adding multiprocessing for sentencepiece_extractor
Goal: Speed up merge extraction for SentencePiece models.
This change adds multiprocessing to the sentencepiece_extractor code to speed up the merge extraction process. With a 128K vocabulary, extraction currently takes more than an hour.
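
For illustration only, here is a minimal sketch of how merge extraction could be split across worker processes. It is not the code from this PR: the names `extract_merges`, `_merges_for_chunk`, `_init_worker`, and the `chunk_size` parameter are hypothetical, and a plain `dict` mapping piece to id stands in for the vocabulary that would be read from a SentencePiece model. The sketch assumes the extraction is a pairwise scan over all pieces, which is why the cost grows quadratically with vocabulary size.

```python
from multiprocessing import Pool

# Per-process copy of the vocabulary (hypothetical helper state).
# Set once per worker so the dict is not pickled again for every task.
_VOCAB = {}


def _init_worker(vocab):
    """Store the piece -> id mapping in the current process."""
    global _VOCAB
    _VOCAB = vocab


def _merges_for_chunk(left_pieces):
    """Find every (left, right) pair whose concatenation is itself a piece."""
    found = []
    for piece_l in left_pieces:
        for piece_r in _VOCAB:
            piece_id = _VOCAB.get(piece_l + piece_r)
            if piece_id is not None:
                found.append((piece_l, piece_r, piece_id))
    return found


def extract_merges(vocab, processes=None, chunk_size=1024):
    """Return merges ordered by the id of the merged piece.

    vocab      -- dict mapping piece (str) to id (int)
    processes  -- worker count; None lets Pool pick, 1 runs in-process
    chunk_size -- number of left-hand pieces handed to a worker per task
    """
    pieces = list(vocab)
    chunks = [pieces[i:i + chunk_size] for i in range(0, len(pieces), chunk_size)]

    if processes == 1:
        # Serial fallback: same logic, no worker processes.
        _init_worker(vocab)
        results = [_merges_for_chunk(chunk) for chunk in chunks]
    else:
        with Pool(processes, initializer=_init_worker, initargs=(vocab,)) as pool:
            # pool.map preserves chunk order, so the output is deterministic.
            results = pool.map(_merges_for_chunk, chunks)

    merges = sorted(
        (merge for chunk in results for merge in chunk),
        key=lambda merge: merge[2],
    )
    return [(piece_l, piece_r) for piece_l, piece_r, _ in merges]
```

Each task scans one slice of left-hand pieces against the full vocabulary, so the tasks are independent and only the final sort by piece id happens in the parent process. On platforms that start workers with `spawn` (Windows, macOS), a call with more than one process should sit under an `if __name__ == "__main__":` guard.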