Di Jin

Results 62 comments of Di Jin

Thank you for helping debug! @Imposingapple

Have you initialized the model with pre-trained parameters from MASS?

Yep, the initial perplexity shouldn't be that high. Have you noticed any warning message about model initialization, something like "modules that are not initialized"?

Have you run "evaluate_mix_CNN_NYT_X.sh"? You should use this file for final evaluation.

I have the same question. Without a large effective batch size (over 32, say, for batch normalization), it is hard to reproduce the results in the original paper, and pytorch is...

I think the pre-trained model should be the correct one. Could you double check whether the model parameters are initialized correctly?
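One way to double check is to compare the parameter names the model expects with the names stored in the checkpoint. Below is a minimal sketch of that comparison in plain Python. The helper name `check_initialization` and all parameter names are hypothetical, made up for illustration. In PyTorch the two key lists would typically come from `model.state_dict().keys()` and from the loaded checkpoint dictionary, and `model.load_state_dict(state, strict=False)` reports the same information as `missing_keys` and `unexpected_keys`.

```python
def check_initialization(model_keys, checkpoint_keys):
    """Compare parameter names expected by the model with those in a checkpoint.

    Returns (missing, unexpected):
      missing    - model parameters with no checkpoint entry (left at random init)
      unexpected - checkpoint entries that no model parameter consumes
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical parameter names; note the naming mismatch in the decoder layer.
model_keys = [
    "encoder.embed.weight",
    "encoder.layer0.weight",
    "decoder.layer0.weight",
]
checkpoint_keys = [
    "encoder.embed.weight",
    "encoder.layer0.weight",
    "decoder.layer_0.weight",
]

missing, unexpected = check_initialization(model_keys, checkpoint_keys)
print("not initialized from checkpoint:", missing)
print("unused checkpoint entries:", unexpected)
```

If `missing` is non-empty, those parameters keep their random initialization, which could explain an unusually high initial perplexity.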

Hi, the premise_mask and hyp_mask are used to indicate the positions of the premise and hypothesis in a sequence. Usually the premise and hypothesis are concatenated together into a sequence...

The premise_mask and hyp_mask can be created together with the input mask, at the step where we tokenize the sequence and convert the tokens into IDs.
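As a rough illustration, here is a minimal sketch of building the three masks in one pass. It rests on several assumptions that are not from the repo: a BERT-style layout `[CLS] premise [SEP] hypothesis [SEP]`, whitespace tokenization with a toy vocabulary in place of a real tokenizer, special tokens excluded from both masks, and all masks padded to the same length as the input IDs. The function name `build_masks` is hypothetical.

```python
def build_masks(premise, hypothesis, max_len=16):
    """Build input_ids, input_mask, premise_mask and hyp_mask for one example.

    Assumed layout (not confirmed by the source): [CLS] premise [SEP] hypothesis [SEP]
    """
    # Stand-in tokenizer: lowercase + whitespace split (a real model would use its own).
    p_tokens = premise.lower().split()
    h_tokens = hypothesis.lower().split()
    tokens = ["[CLS]"] + p_tokens + ["[SEP]"] + h_tokens + ["[SEP]"]
    if len(tokens) > max_len:
        raise ValueError("sequence longer than max_len; truncate before masking")

    # Toy vocabulary built on the fly; ID 0 is reserved for padding.
    vocab = {}
    input_ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in tokens]

    # 1 marks real tokens, premise positions, and hypothesis positions respectively.
    input_mask = [1] * len(tokens)
    premise_mask = [0] + [1] * len(p_tokens) + [0] + [0] * len(h_tokens) + [0]
    hyp_mask = [0] + [0] * len(p_tokens) + [0] + [1] * len(h_tokens) + [0]

    # Pad everything to max_len so all four lists line up position by position.
    pad = [0] * (max_len - len(tokens))
    return {
        "input_ids": input_ids + pad,
        "input_mask": input_mask + pad,
        "premise_mask": premise_mask + pad,
        "hyp_mask": hyp_mask + pad,
    }


example = build_masks("the patient has a fever", "give antipyretics", max_len=12)
for name, values in example.items():
    print(name, values)
```

Because the masks are derived from the same token list as the IDs, they stay aligned with the input even if the tokenizer is swapped out.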

1. The premise should be the concatenation of the passage and the question, while the hypothesis is the answer.
2. The length of both premise and hypothesis masks should be the same as the...

Nope, but you can use a score of 60 out of 100 as a reference, which is the passing score for the Med Exam.