parallel-corpora-tools
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
Results
1
parallel-corpora-tools issues
Remove sentences where the number of non-space characters is equal (or very close?) to the number of tokens. English > ( c o n t i n u a t...
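The heuristic in this issue can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names, the `threshold` ratio used to interpret "very close", and the `min_tokens` guard are all assumptions made for the example.

```python
from typing import Iterable, List


def is_char_spaced(sentence: str, threshold: float = 0.9, min_tokens: int = 3) -> bool:
    """Return True if the sentence looks like space-separated characters.

    threshold and min_tokens are hypothetical parameters: a ratio of 1.0
    means every whitespace-delimited token is a single character.
    """
    tokens = sentence.split()  # split on any run of whitespace
    if len(tokens) < min_tokens:
        # Too short to judge; also avoids dividing by zero on empty input.
        return False
    non_space_chars = sum(len(tok) for tok in tokens)
    return len(tokens) / non_space_chars >= threshold


def filter_char_spaced(sentences: Iterable[str], threshold: float = 0.9) -> List[str]:
    """Keep only the sentences that are not flagged by is_char_spaced."""
    return [s for s in sentences if not is_char_spaced(s, threshold)]
```

For a parallel corpus, the same check would presumably be applied to both sides of each pair, dropping the pair if either side is flagged. Lowering `threshold` catches sentences that mix spaced-out characters with a few ordinary tokens, at the cost of more false positives on languages or tokenizations with many single-character tokens.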