NeMo-Curator

FastTextQualityFilter model file release

Open · simplew2011 opened this issue 1 year ago · 1 comment

I can't find the FastTextQualityFilter model weight file. How can I download it?

simplew2011 · Jun 18 '24 10:06

Hello! We don't provide a pretrained model for you to use, but we do demonstrate how to train your own. All you need is a low-quality data source (such as unfiltered Common Crawl snapshots) and a high-quality data source (such as Wikipedia); with those in hand, you can follow this example script.

ryantwolf · Jun 24 '24 17:06
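
For anyone landing here later, a rough sketch of the idea in the reply above: label the high-quality documents one way, the low-quality ones another, write them in fastText's `__label__` text format, and train a supervised classifier. This is not the example script from the repository. The label names `__label__hq` and `__label__lq`, the helper names, and the file paths are assumptions made for illustration. The training step uses the `fasttext` Python package (`pip install fasttext`), imported lazily so the data-preparation part runs without it.

```python
import random
from pathlib import Path

# Hypothetical label names; any pair works as long as they use
# fastText's "__label__" prefix and are applied consistently.
HQ_LABEL = "__label__hq"
LQ_LABEL = "__label__lq"


def to_fasttext_line(label, text):
    """Format one document as a fastText training line.

    fastText reads one example per line, so all runs of whitespace
    (including newlines) are collapsed to single spaces.
    """
    return f"{label} {' '.join(text.split())}"


def write_training_file(high_quality, low_quality, path, seed=0):
    """Write a shuffled fastText training file; return the number of examples."""
    lines = [to_fasttext_line(HQ_LABEL, t) for t in high_quality]
    lines += [to_fasttext_line(LQ_LABEL, t) for t in low_quality]
    random.Random(seed).shuffle(lines)  # mix the two classes deterministically
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def train_quality_model(train_path, model_path):
    """Train and save a supervised fastText classifier (requires `fasttext`)."""
    import fasttext  # imported here so the helpers above work without it

    model = fasttext.train_supervised(input=str(train_path))
    model.save_model(str(model_path))
    return model
```

Typical use would be `write_training_file(wikipedia_docs, common_crawl_docs, "train.txt")` followed by `train_quality_model("train.txt", "quality.bin")`, where both document lists are plain strings you have loaded yourself. The saved `.bin` file is then the model weight file the question asks about.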