jinming

Results: 10 issues by jinming

Hi, when I download data.zip, an error is raised. I now have the multinli_0.9 and snli datasets, but I don't have saved_embed.pt. So, I want to know the...

In the framework, there is a highway network before the BiLSTM layers (the context representation and aggregation layers). What is the reason for adding this highway layer?

I ran into a problem. For the output of the full match layer, the paper gives: sentence1_match_output shape = batchsize * len(sentence2) * L (num perspective) sentence2_match_output shape = batchsize * len(sentence1) * L (num perspective) and these are then used as the aggregation lstm...

Hi, https://github.com/ne7ermore/deeping-flow/blob/master/reinforced-translate/model.py b_words = model.sample(self.prev) s_words, s_props = model.sample(self.prev, False) rewards = self.compute_levenshtein(model.tgt, s_words) baseline = self.compute_levenshtein(model.tgt, b_words) advantage = rewards - baseline Here b_words and s_words are both of type Tensor,...

Hi, https://github.com/ne7ermore/deeping-flow/blob/master/reinforced-translate/model.py#L175 mask = pad_mask(model.tgt, EOS, [args.batch_size, args.max_len]) Shouldn't the MC-sampled sequence be used as the target here? Should it be s_words? Thanks

Hi, Thank you for sharing your code and models! I need to use your code and models on my own video data for other tasks. My own videos are the...

/data/pretrained_model/glm-10b-chinese-mp4/80000/mp_rank_03_model_states.pt. Traceback (most recent call last): File "pretrain_glm.py", line 673, in <module> main() File "pretrain_glm.py", line 584, in main args.iteration = load_checkpoint(model, optimizer, lr_scheduler, args, no_deepspeed=args.no_deepspeed_load) File "/data/users/zhaojinming/source/glm10BCodesSlurmN1/utils.py", line 339, in...

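Hello, below is the training output from several stages. The center loss drops very quickly while the softmax loss barely moves; as training proceeds, the center loss gradually increases and the softmax loss gradually decreases. Training on other datasets behaves the same way. Is this normal, and could you explain why it happens? (Normally the l2 loss behaves like this too: the l2 loss usually drops first and the softmax loss drops afterwards, which makes training very slow. Could you help explain this?) Thanks! step: 0, training accuracy: 0.01, training loss: 9.13, center_loss_value:...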

Hi, Thanks for releasing your data and code. In this paper, you use the input message and scene info to decide the source-entity and use the response to decide the...