Zhixing Tan comments

Results: 14 comments by Zhixing Tan

Input file format when only decoding

Our system cannot predict predicates. He's [deep_srl](https://github.com/luheng/deep_srl) can predict predicates and SRL jointly.

How to configure early stopping?

Validation is performed by `run.sh`, which is invoked every 300 seconds by default. Make sure `run.sh` works properly.

How to configure early stopping?

Normally, there would be a file named `log` which records the F1 score of each validation. For example:

```
model.ckpt-2246: 24.330000
model.ckpt-4470: 49.570000
model.ckpt-6699: 59.640000
model.ckpt-8919: 65.980000
```

If this...
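A `log` file in the format quoted above can be read back to find the best checkpoint or to apply a simple patience rule. The following is only a sketch, assuming one `checkpoint: score` entry per line. The function names and the `patience` parameter are hypothetical and are not part of the toolkit.

```python
SAMPLE_LOG = """\
model.ckpt-2246: 24.330000
model.ckpt-4470: 49.570000
model.ckpt-6699: 59.640000
model.ckpt-8919: 65.980000
"""


def parse_log(text):
    """Parse 'checkpoint: score' entries into (name, score) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so the checkpoint name is kept intact.
        name, _, score = line.rpartition(":")
        entries.append((name.strip(), float(score)))
    return entries


def best_checkpoint(entries):
    """Return the (name, score) pair with the highest score."""
    return max(entries, key=lambda entry: entry[1])


def should_stop(entries, patience=3):
    """Hypothetical rule: stop when the last `patience` validations
    did not improve on the best score seen before them."""
    if len(entries) <= patience:
        return False
    scores = [score for _, score in entries]
    best_before = max(scores[:-patience])
    return max(scores[-patience:]) <= best_before


entries = parse_log(SAMPLE_LOG)
print(best_checkpoint(entries))
print(should_stop(entries, patience=2))
```

With the scores still rising, as in the sample, the rule does not trigger.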

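Question about the FFN

With a 3D input, the operation can be viewed as a 1 * 1 convolution. Both are in fact linear maps; convolution is used only to avoid excessive reshape operations.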
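The equivalence can be checked with a small pure-Python sketch. It assumes an input of shape `[batch][length][d_in]` and a weight of shape `[d_in][d_out]`. The function names are illustrative and do not come from the repository.

```python
def matvec(vec, weight):
    """Multiply a d_in vector by a [d_in][d_out] weight matrix."""
    d_out = len(weight[0])
    return [sum(vec[i] * weight[i][j] for i in range(len(vec)))
            for j in range(d_out)]


def linear_with_reshape(x, weight):
    """Flatten the 3D input to rows, apply the linear map, restore the shape."""
    batch, length = len(x), len(x[0])
    rows = [vec for seq in x for vec in seq]       # [batch * length][d_in]
    out = [matvec(row, weight) for row in rows]    # plain matrix multiply
    return [out[b * length:(b + 1) * length] for b in range(batch)]


def conv_width_one(x, weight):
    """Width-1 convolution along the length axis, applied to the 3D input directly."""
    kernel = [weight]                              # [width=1][d_in][d_out]
    d_out = len(weight[0])
    result = []
    for seq in x:
        out_seq = []
        for t in range(len(seq) - len(kernel) + 1):
            acc = [0] * d_out
            for k, w_k in enumerate(kernel):
                for i, value in enumerate(seq[t + k]):
                    for j in range(d_out):
                        acc[j] += value * w_k[i][j]
            out_seq.append(acc)
        result.append(out_seq)
    return result


x = [[[1, 2], [3, 4], [5, 6]],
     [[0, 1], [1, 0], [2, 2]]]
w = [[1, 0, 2],
     [0, 1, 3]]
print(linear_with_reshape(x, w) == conv_width_one(x, w))
```

Both paths compute the same per-position linear map. The convolution version simply never has to flatten and restore the batch and length axes.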

Decoding seems to be reloading the model...?

Sorry for the late reply. 1. The reloading problem is caused by the old API of `tf.contrib.learn.Estimator`. It can be replaced with `tf.estimator.Estimator`, which does not have this problem. However,...

use cpu to inference

Unfortunately, the PyTorch implementation currently does not support CPU for inference.

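Using the IWSLT17 Chinese-English dataset, BLEU keeps rising during training with no sign of convergence, but the model generalizes poorly on the test set

First, it is not clear which development set is used here. IWSLT is a spoken-language dataset and is relatively small, while newstest is a news dataset. The two domains differ greatly, so a low BLEU on newstest is understandable. The BLEU reported during training is usually computed on BPE-segmented text rather than on tokenized text, and this value is generally higher.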
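To compare scores fairly, the BPE segmentation is normally undone before BLEU is computed on tokenized text. The sketch below assumes the common `@@ ` continuation marker (the subword-nmt convention). Whether this setup uses that marker is an assumption, and `undo_bpe` is a hypothetical helper name.

```python
import re


def undo_bpe(line, marker="@@"):
    """Join subword units back into words.

    Assumes a subword that continues into the next token ends with `marker`.
    """
    escaped = re.escape(marker)
    # Remove "marker + space" inside the line and a dangling marker at the end.
    return re.sub(rf"({escaped} )|({escaped} ?$)", "", line)


segmented = "new@@ st@@ est is a news test set"
print(undo_bpe(segmented))
print(len(segmented.split()), len(undo_bpe(segmented).split()))
```

The segmented line contains more, shorter units than the restored one, so n-gram statistics computed on the two forms are not comparable.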

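How to correctly add pretrained word embeddings

Pretrained word embeddings are usually added through an initializer and are assigned only at initialization. When a saved checkpoint exists, the initialized parameters are overwritten by the parameters in the checkpoint. During validation, the previously saved checkpoint is restored, so in principle the original pretrained embeddings should not be loaded again.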
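The order of operations can be illustrated with a toy sketch in plain Python. It is not the toolkit's code: `build_parameters`, the dictionary-based checkpoint, and the values are all hypothetical stand-ins for the framework's variable initialization and checkpoint restore.

```python
def build_parameters(initializers, checkpoint=None):
    """Run every initializer, then overwrite with checkpoint values if present."""
    params = {name: init() for name, init in initializers.items()}
    if checkpoint is not None:
        for name, value in checkpoint.items():
            if name in params:
                params[name] = value   # restored value replaces the initial one
    return params


# Hypothetical pretrained vectors, used only as the initial value.
pretrained = [[0.1, 0.2], [0.3, 0.4]]
initializers = {"embedding": lambda: [row[:] for row in pretrained]}

# First run: no checkpoint yet, so the pretrained vectors are used.
fresh = build_parameters(initializers)

# Later run: a checkpoint exists, so its (trained) values win.
checkpoint = {"embedding": [[0.5, 0.6], [0.7, 0.8]]}
resumed = build_parameters(initializers, checkpoint)

print(fresh["embedding"] == pretrained)
print(resumed["embedding"] == checkpoint["embedding"])
```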

Do you have an instruction manual for the pytorch version?

In the above example, you should set `shared_embedding_and_softmax=true` instead of `shared_embedding_and_softmax=True`. The documentation for the PyTorch implementation will be uploaded soon. We have tested our implementation on several datasets, but we...