Document-Transformer
Improving the Transformer translation model with document-level context
Data preprocessing
Hello: I am a student at Peking University working on document-level NMT. If it is convenient, could you provide the dataset mentioned in the paper? Also, could you release the data preprocessing scripts? What preprocessing should be applied to the Chinese side? Looking forward to your reply. Best wishes!
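Below is a minimal sketch of a typical Chinese-side preprocessing pipeline, not the authors' official one: word segmentation with jieba followed by BPE via subword-nmt. The file names (`bpe.zh.codes`, `train.zh`, `train.tok.bpe.zh`) are placeholders for illustration.

```python
# Hedged sketch: assumes jieba and subword-nmt are installed and that a BPE
# codes file has already been learned; none of this is confirmed to match the
# preprocessing used in the paper.
import jieba
from subword_nmt.apply_bpe import BPE

with open("bpe.zh.codes", encoding="utf-8") as codes:  # hypothetical codes file
    bpe = BPE(codes)

with open("train.zh", encoding="utf-8") as src, \
     open("train.tok.bpe.zh", "w", encoding="utf-8") as out:
    for line in src:
        tokens = " ".join(jieba.cut(line.strip()))    # Chinese word segmentation
        out.write(bpe.process_line(tokens) + "\n")    # subword (BPE) segmentation
```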
Compared with training on the 2M zh-en corpus, what parameters should I use when training the model on a 940k zh-en corpus? I have tried batch_size=25k,...
When I use the context-level model for decoding and testing, is the parameter MODEL-PATH a folder containing all models, or a single model file? If it is the former, when...
Hello~ When I use this code to train a model, what format should the source corpus, the target corpus, and the context corpus be in? Should they be tokenized and BPE-encoded?...
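One plausible arrangement, offered only as an assumption and not confirmed by the repository, is that all three files are line-aligned (one sentence per line, tokenized and BPE-encoded like the source), with each context line holding the preceding source sentences of the same document. The sketch below builds such a context file; `train.tok.bpe.zh`, `train.ctx.zh`, and the window size K are hypothetical.

```python
# Hedged sketch: derive a context corpus from a tokenized/BPE'd source corpus
# by concatenating the previous K source sentences. This is an assumed format,
# not the repository's documented one.
K = 2  # number of preceding sentences used as context (assumed)

def build_context(src_sentences, k=K):
    """Return one context line (possibly empty) per source sentence."""
    return [" ".join(src_sentences[max(0, i - k):i])
            for i in range(len(src_sentences))]

with open("train.tok.bpe.zh", encoding="utf-8") as f:
    sents = [line.strip() for line in f]

with open("train.ctx.zh", "w", encoding="utf-8") as out:
    out.write("\n".join(build_context(sents)) + "\n")
```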
@Glaceon31 Thank you in advance!
Hello, I have now finished training the Transformer model and need to continue training following the instructions in the README, but the problem is that I do not know the exact format of the source corpus, target corpus, and context corpus files. If possible, could you provide the relevant files? Looking forward to your reply. Best wishes. My email is [email protected]