Usage of chinese_text.txt.enzh in _processed_data, and a translation mismatch with the example image
Hi,
I would like to ask why there are both chinese_test.txt and chinese_text.txt.enzh. chinese_text.txt.enzh seems to contain only the even-indexed lines of chinese_text.txt. Which one should I use for testing?
Also, the example image translates "hold our course" as 保持我们的航向, but chinese_text.txt translates it as 坚持我们的课程. Is that a mistake, or am I looking at the wrong file?
Thank you for your attention.
Please use chinese_text.txt.enzh for testing. The even-indexed lines of chinese_text.txt are for the en->zh direction and the odd-indexed lines are for the zh->en direction; together they make up a complete dialogue. As for the translation mismatch, we may have uploaded the wrong version, from before human checking. Thanks for your attention.
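For anyone who wants to reproduce the split locally, here is a minimal sketch of the even/odd separation described above. It assumes one utterance per line and 0-based indexing (the thread does not say whether the indices start at 0 or 1), and `split_by_direction` is a hypothetical helper name, not something from the repository.

```python
from typing import List, Tuple


def split_by_direction(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Split dialogue lines into (en->zh, zh->en) by line index.

    Assumption: indices are 0-based, so even-indexed lines go to en->zh
    and odd-indexed lines go to zh->en. Swap the two slices if the data
    turns out to be 1-based.
    """
    enzh = lines[0::2]  # lines 0, 2, 4, ...
    zhen = lines[1::2]  # lines 1, 3, 5, ...
    return enzh, zhen


# Placeholder data for illustration only, not taken from the dataset.
sample = ["utterance 0", "utterance 1", "utterance 2", "utterance 3"]
enzh, zhen = split_by_direction(sample)
print(enzh)
print(zhen)
```

To apply it to the real file, read chinese_text.txt with `open(path, encoding="utf-8")` and pass `f.read().splitlines()` to the function; the even-indexed output can then be compared against chinese_text.txt.enzh.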
Why is the test file split by direction instead of using the whole file for testing?