fengxin619

Results 14 comments of fengxin619

I have run into the same problem. How can it be solved?

> loss = F.nll_loss(output[1:].view(-1, vocab_size), trg[1:].contiguous().view(-1), ignore_index=pad)
>
> The loss computed by the line above is averaged over every time step, which can make it difficult to train...
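To make the quoted point about averaging concrete, here is a minimal plain-Python sketch of how a mean-reduced negative log-likelihood with an ignored padding index behaves. The helper `nll_loss` below is a hypothetical stand-in written for illustration, not the PyTorch function, and the probabilities and targets are made up. It assumes the default "mean" behaviour, where the sum is divided by the number of non-padding targets.

```python
import math

PAD = 0  # hypothetical padding index, standing in for `pad` in the quoted line


def nll_loss(log_probs, targets, ignore_index, reduction="mean"):
    """Illustrative stand-in: NLL over flattened time steps, skipping ignore_index."""
    losses = [-row[t] for row, t in zip(log_probs, targets) if t != ignore_index]
    if not losses:
        return 0.0  # choice made for this sketch when every target is padding
    total = sum(losses)
    if reduction == "sum":
        return total
    # "mean": divide by the number of targets that were not ignored
    return total / len(losses)


# Four flattened time steps over a 3-word vocabulary (made-up numbers).
probs = [
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
    [0.8, 0.1, 0.1],  # this step's target is PAD, so it is skipped
    [0.1, 0.6, 0.3],
]
log_probs = [[math.log(p) for p in row] for row in probs]
targets = [1, 2, PAD, 1]

mean_loss = nll_loss(log_probs, targets, PAD, reduction="mean")
sum_loss = nll_loss(log_probs, targets, PAD, reduction="sum")
num_scored = sum(1 for t in targets if t != PAD)
```

With three non-padding targets, `sum_loss` is three times `mean_loss`, so the two reductions differ only by a scale factor that depends on how many tokens are scored.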

> @rose-jinyang We have updated our code base to produce a TF saved model. Both the API [export_graph](https://github.com/onnx/onnx-tensorflow/blob/master/doc/API.md#onnx_tfbackend_reptensorflowrepexport_graph) and the CLI [convert](https://github.com/onnx/onnx-tensorflow/blob/master/doc/CLI.md#convert) will produce a saved model for you. Which version will...

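> There is no misalignment. Think it through again carefully: the last position of predictions is sep, and the output corresponding to that sep is meaningless. Wow, such a quick reply. But does target_mask drop the first position? Is that the position corresponding to [CLS]?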

> 0, 0, 0, 0, 1, 1 Impressive, thank you... I have finally worked out this brain teaser!

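> Right. If the sentence is [cls, 1, 2, sep, 3, 4, sep], then the prediction output is evaluated at the results for the tokens [sep, 3, 4], so [cls, 1, 2] is masked out, which is what target_mask is used for. The token_type_id for this sentence is [0, 0, 0, 0, 1, 1, 1]; taking it from the second position onward gives [0, 0, 0, 1, 1, 1], and the prediction output is [cls, 1, 2, sep,...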

> How do you think it should be taken? target_ids_padded = token_ids_padded[:, :-1].contiguous() Like this? ..... Please set me straight.

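> Right. If the sentence is [cls, 1, 2, sep, 3, 4, sep], then the prediction output is evaluated at the results for the tokens [sep, 3, 4], so [cls, 1, 2] is masked out, which is what target_mask is used for. The token_type_id for this sentence is [0, 0, 0, 0, 1, 1, 1]; taking it from the second position onward gives [0, 0, 0, 1, 1, 1], and the prediction output is [cls, 1, 2, sep,...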
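The alignment described in the quoted reply can be traced with plain Python lists. This is only a sketch of the example sentence from the quote. It assumes that the output at position i is scored against the token at position i + 1, so the output for the final sep is dropped; the names `prediction_inputs`, `targets`, and `scored` are hypothetical and chosen for illustration.

```python
# Example sentence and segment ids taken from the quoted reply.
tokens = ["cls", "1", "2", "sep", "3", "4", "sep"]
token_type_id = [0, 0, 0, 0, 1, 1, 1]

# Assumption: the output at position i predicts token i + 1, so the output
# for the final "sep" has nothing to predict and is dropped.
prediction_inputs = tokens[:-1]   # positions whose outputs are kept
targets = tokens[1:]              # tokens shifted left by one
target_mask = token_type_id[1:]   # segment ids taken from the second position

# Keep only the positions where the mask is 1 (the second segment).
scored = [
    (src, tgt)
    for src, tgt, keep in zip(prediction_inputs, targets, target_mask)
    if keep == 1
]

for src, tgt in scored:
    print(f"output at {src!r} is scored against target {tgt!r}")
```

Under this assumption, the outputs at [sep, 3, 4] are the ones that contribute to the loss, and the outputs at [cls, 1, 2] are masked out, which matches the mask [0, 0, 0, 1, 1, 1] in the quote.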

@Fuckmi Could you tell me how you changed the linear layer in the discriminator?