awesome-align
Added multiprocessing to the LineByLineTextDataset class
Added multiprocessing to the LineByLineTextDataset class, since tokenizer.prepare_for_model takes a long time to process large datasets.
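A rough sketch of the idea, not the code from this change: the per-line encoding is fanned out over a worker pool and results are collected in input order. `encode_line` is a hypothetical stand-in for the tokenizer work (the real class would call the tokenizer there), and `encode_lines`, `num_workers`, and `chunksize` are names assumed for illustration.

```python
from multiprocessing import Pool


def encode_line(line):
    """Hypothetical stand-in for the expensive per-line tokenizer call."""
    # A real dataset would run the tokenizer here; this just splits on whitespace.
    return line.lower().split()


def encode_lines(lines, num_workers=2, chunksize=64):
    """Encode every line, in parallel when num_workers > 1, preserving order."""
    if num_workers <= 1:
        # Serial fallback, useful for debugging or very small inputs.
        return [encode_line(line) for line in lines]
    with Pool(processes=num_workers) as pool:
        # imap yields results in input order; chunksize batches several
        # lines per task to reduce inter-process overhead.
        return list(pool.imap(encode_line, lines, chunksize=chunksize))


if __name__ == "__main__":
    sample = ["Das ist ein Test ||| This is a test", "Hallo Welt ||| Hello world"]
    print(encode_lines(sample, num_workers=2, chunksize=1))
```

The worker function is defined at module level so it can be pickled and sent to the worker processes. Whether the extra processes pay off depends on how expensive each line is relative to the cost of transferring lines and results between processes.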