Knover
Large-scale open domain KNOwledge grounded conVERsation system based on PaddlePaddle
For sentiment classification I have used random-mask and n-gram data augmentation. For dialogue tasks, would this kind of augmentation also work well? Are there any other effective data-augmentation methods?
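For context, here is a minimal sketch of the two augmentations mentioned above, applied to a tokenized utterance. The token IDs, `MASK_ID`, and the probabilities are illustrative assumptions, not part of Knover's pipeline:

```python
# Illustrative random-mask and n-gram-drop augmentation for dialogue data.
import random

MASK_ID = 103  # hypothetical [MASK] token id

def random_mask(token_ids, mask_prob=0.15):
    """Replace each token with [MASK] independently with probability mask_prob."""
    return [MASK_ID if random.random() < mask_prob else t for t in token_ids]

def ngram_drop(token_ids, max_n=3, drop_prob=0.1):
    """With probability drop_prob, delete one random n-gram (n <= max_n)."""
    if len(token_ids) > max_n and random.random() < drop_prob:
        n = random.randint(1, max_n)
        start = random.randrange(len(token_ids) - n)
        return token_ids[:start] + token_ids[start + n:]
    return token_ids

# Usage: augment only the context side, so the target response stays intact.
context = [5, 17, 42, 8, 99, 23]
augmented = ngram_drop(random_mask(context))
```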
There is no `__model__` in the model file 24L/Plato, so it cannot be converted to ONNX, while NSP does have `__model__`. Why is that?
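Presumably the 24L/Plato directory holds a training checkpoint (parameters only), while the NSP directory was exported for inference. A minimal sketch, assuming PaddlePaddle 1.x static graph, of the export step that actually produces a `__model__` file (the network here is a toy placeholder, not Knover's export code):

```python
import paddle.fluid as fluid

# Toy network standing in for the real model.
main_prog = fluid.Program()
startup_prog = fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    x = fluid.data(name="token_ids", shape=[None, 16], dtype="float32")
    logits = fluid.layers.fc(input=x, size=2)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(startup_prog)

# save_inference_model serializes the program as `__model__` (the default
# when model_filename is None) plus one file per parameter; checkpoints
# saved with save_persistables contain parameters only, with no __model__.
fluid.io.save_inference_model(
    dirname="plato_infer",            # hypothetical output directory
    feeded_var_names=["token_ids"],
    target_vars=[logits],
    executor=exe,
    main_program=main_prog)
```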

I have been reading PLATO-2 recently and have a question. In stage 1.1 (coarse-grained training), where the 1-to-1 mapping is learned, the NLL loss is used, but what does the `E` in front of the formula refer to? In stage 2.1 it denotes sampling a z from the distribution of z, which I can understand, but I don't understand the E in stage 1.1.
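For reference, a reconstruction of the two NLL losses (notation lightly adapted from the PLATO-2 paper). In stage 1.1 the E is the expectation over (context c, response r) pairs drawn from the training data D; in stage 2.1 it is additionally over the latent z sampled from the posterior:

```latex
% Stage 1.1 (coarse-grained) NLL loss: the expectation is over
% (context c, response r) pairs drawn from the training data D.
\mathcal{L}_{NLL}^{\text{coarse}}
  = -\,\mathbb{E}_{(c,r)\sim D}\,\log p(r \mid c)
  = -\,\mathbb{E}_{(c,r)\sim D}\sum_{t=1}^{T}\log p(r_t \mid c, r_{<t})

% Stage 2.1 (fine-grained) NLL loss: here the expectation is over the
% latent z sampled from the posterior p(z | c, r).
\mathcal{L}_{NLL}^{\text{fine}}
  = -\,\mathbb{E}_{z\sim p(z\mid c,r)}\sum_{t=1}^{T}\log p(r_t \mid c, r_{<t}, z)
```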
About training the NSPModel
1) The paper describes the NSPModel as follows: "To select the most appropriate responses generated by the fine-grained generation model, the evaluation model is trained to estimate the coherence of the responses." I understood this as training a classification model on the candidates generated by stage 2.1 plus labels, whereas in the code...
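For what it's worth, a common way to train such a coherence classifier without generated candidates is to treat the ground-truth (context, response) pair as positive and pair the same context with a randomly sampled response as negative. A minimal sketch of that pair construction (illustrative only, not the repository's code):

```python
# Build NSP training pairs from (context, response) dialogues: the true pair
# is labeled 1, the same context with a random response is labeled 0.
import random

def build_nsp_pairs(dialogues):
    """dialogues: list of (context, response) tuples."""
    responses = [r for _, r in dialogues]
    pairs = []
    for context, response in dialogues:
        pairs.append((context, response, 1))                  # coherent pair
        pairs.append((context, random.choice(responses), 0))  # random negative
    return pairs
```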
Hello, I was retraining the NSP model on some of my own data and found that the masking strategy can leave tgt_label empty. Specifically, in the `_pad_batch_records` function of `nsp_reader.py`:

```python
batch_mask_token_ids, tgt_label, tgt_pos, label_pos = mask(
    batch_tokens=batch_token_ids,
    vocab_size=self.vocab_size,
    bos_id=self.bos_id,
    eos_id=self.eos_id,
    mask_id=self.mask_id,
    sent_b_starts=batch_tgt_start_idx,
    labels=batch_label,
    is_unidirectional=False)
```

With this masking strategy it sometimes happens that every sampled prob is > 0.15, so mask_label and mask_pos both end up empty. For now I worked around it by resampling until they are non-empty.
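A minimal sketch of the workaround described above: resample the mask until at least one token was actually masked. The `mask` call and its arguments are the ones from `nsp_reader.py`; the retry loop itself is the hypothetical fix, not repository code:

```python
# Resample until the mask selects at least one target position.
while True:
    batch_mask_token_ids, tgt_label, tgt_pos, label_pos = mask(
        batch_tokens=batch_token_ids,
        vocab_size=self.vocab_size,
        bos_id=self.bos_id,
        eos_id=self.eos_id,
        mask_id=self.mask_id,
        sent_b_starts=batch_tgt_start_idx,
        labels=batch_label,
        is_unidirectional=False)
    if len(tgt_label) > 0:  # at least one position was masked
        break
```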
Plato model inference error!!! The same config works for the training process, but it fails for inference.
```
aistudio@jupyter-208728-1765888:~/Knover$ git branch -av
  develop                dcf05a0 Support PaddlePaddle 2.0.
* master                 4bad22c Fix checkpoints and add document for continuous training (#31)
  remotes/origin/HEAD    -> origin/develop
  remotes/origin/develop dcf05a0 Support PaddlePaddle 2.0....
```

When calling train.py, batch_size can be set to around 8000 and one step takes about 200 s, but when calling infer.py, batch_size can only be set very small (4 or 12; anything above 32 may run out of GPU memory). This contradicts the usual intuition: eval mode should be faster and use less memory than train mode. What is the reason?
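One plausible explanation (an assumption, not a confirmed diagnosis): a training batch_size of ~8000 suggests it is counted in tokens, whereas infer.py's batch_size counts examples, and PLATO-2 generation duplicates each example once per latent value (K = 20 in the paper) and keeps decoder key/value caches for every duplicated row before NSP re-ranking. A back-of-the-envelope sketch with made-up lengths:

```python
# Rough memory accounting for PLATO-2 inference; every number below is an
# illustrative assumption, not measured from Knover.
infer_examples = 32        # infer.py batch_size (examples)
num_latent = 20            # K latent values in PLATO-2
seq_len = 256              # hypothetical context length (tokens)
max_dec_len = 64           # hypothetical decoding length (tokens)

# Each example is duplicated once per latent value during generation.
effective_rows = infer_examples * num_latent                  # 640 sequences
effective_tokens = effective_rows * (seq_len + max_dec_len)   # 204,800 tokens

# Compare with training, where batch_size ~ 8000 is (presumably) in tokens:
train_tokens = 8000
print(effective_tokens / train_tokens)  # ~25x the training batch, hence OOM
```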