BlonDe
Questions about sentence-level alignment of BWB
Hi,
Thanks for your great work and for the new large-scale document-level dataset. However, when I do sentence-level alignment using the " <sep> " symbol as the delimiter, the English and Chinese sentences do not align well.
For example, I checked the first line of 282.enu/chs in the training data:
Chapter 1 - Birthday Gift <sep> 001- Birthday gift Lu Mingshu would always remember her seventh birthday. <sep> **** 陆明舒一直记得七岁生辰那天。 <sep> ****
The translation of "Chapter 1 - Birthday Gift" does not appear in the Chinese text, so the English side has one more segment than the Chinese side.
Looking forward to your reply. Thanks!
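
For reference, here is a minimal sketch of the kind of check described above. It assumes the example line is the English line followed by the Chinese line, and that segments are separated by the literal string " <sep> ". The function names `split_segments` and `segment_counts` are hypothetical helpers written for this illustration; they are not part of the BlonDe or BWB code.

```python
from typing import List, Tuple

SEP = " <sep> "  # delimiter assumed from the example line above


def split_segments(line: str) -> List[str]:
    """Split one document line into sentence segments on the separator."""
    return [seg.strip() for seg in line.rstrip("\n").split(SEP)]


def segment_counts(src_line: str, tgt_line: str) -> Tuple[int, int]:
    """Return the number of segments on the source and target side."""
    return len(split_segments(src_line)), len(split_segments(tgt_line))


# The example line, divided into its English and Chinese parts
# (this division is an assumption about how the two files look).
en = ("Chapter 1 - Birthday Gift <sep> 001- Birthday gift Lu Mingshu would "
      "always remember her seventh birthday. <sep> ****")
zh = "陆明舒一直记得七岁生辰那天。 <sep> ****"

n_en, n_zh = segment_counts(en, zh)
if n_en != n_zh:
    print(f"segment count mismatch: {n_en} English vs {n_zh} Chinese")
```

Under that assumption, the English side splits into three segments and the Chinese side into two, which is the mismatch reported here.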