parallel-corpus topic
indonesian-NLP-resources
Data resources for Indonesian NLP
Classical-Modern
A very comprehensive parallel corpus of Classical Chinese (ancient texts) and Modern Chinese
banglanmt
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Pr...
Cross-Language-Dataset
A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection
OpusFilter
OpusFilter - Parallel corpus processing toolkit
nepali-translator
Neural Machine Translation on the Nepali-English language pair
Indian_ParallelCorpus
Curated list of publicly available parallel corpora for Indian languages
bertalign
Multilingual sentence alignment using sentence embeddings
astred
An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For instance, useful for comparing a translation with the original text,...
TALPCo
TUFS Asian Language Parallel Corpus