Min

18 comments of Min

I suggest splitting this chain into a three-way fork, with some shared skills between them.

Will take a look into it! Much appreciated @pencoa :)

Hi @chesterkuo , is your result above obtained after this commit? f0c79cc93dc1dfdad2bc8abb712a53d078814a56

Thanks for reporting your results on CPU @ajay-sreeram! That is some impressive patience. I talked to the original...

Hi @PhungVanDuy , it seems like you are using the latest commit. @jasonwbw mentioned in the previous comment that he used the model before commit https://github.com/NLPLearn/QANet/commit/f0c79cc93dc1dfdad2bc8abb712a53d078814a56. You could revert back...
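Reverting a clone to the known-good commit is just a checkout. A sketch in a throwaway repository (file names and commit messages are made up for the demo; in a real clone of NLPLearn/QANet you would check out the hash above directly):

```shell
# Demo in a temporary repo: two commits, then check out the first one,
# just as you would run
#   git checkout f0c79cc93dc1dfdad2bc8abb712a53d078814a56
# inside a clone of NLPLearn/QANet.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email demo@example.com
git config user.name demo
echo "old trilinear" > layers.py
git add layers.py && git commit -qm "known-good"
good=$(git rev-parse HEAD)
echo "new trilinear" > layers.py
git add layers.py && git commit -qm "later change"
git checkout -q "$good"   # detached HEAD at the known-good commit
cat layers.py             # -> old trilinear
```

Note this leaves the repo in a detached-HEAD state, which is fine for re-running training against an older snapshot.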

@PhungVanDuy can you post the error you get here?

@jasonwbw looking at the error above, did you use the old trilinear function for the pretrained model? It seems like the exponential moving average for the optimised trilinear function is missing.
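For context, the exponential moving average (EMA) here keeps a decayed shadow copy of each weight alongside the raw value; a checkpoint that expects those shadow variables cannot be restored if they were never created. A minimal sketch in plain Python (illustrative only; the decay value and update schedule are assumptions, not taken from the QANet codebase):

```python
# Sketch of an exponential moving average over a single weight value.
# decay=0.5 is chosen only to make the numbers easy to follow.
def ema_update(shadow, value, decay=0.5):
    """Return the new shadow value after one EMA step."""
    return decay * shadow + (1.0 - decay) * value

shadow = 0.0
for step_value in [1.0, 1.0, 1.0]:
    shadow = ema_update(shadow, step_value)
print(shadow)  # 0.5 -> 0.75 -> 0.875, drifting toward the raw value 1.0
```

At evaluation time the shadow values stand in for the raw weights, which is why a model saved with EMA variables for one trilinear implementation will not load cleanly against another.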

@PhungVanDuy comment out the old trilinear function as it is in the latest commit. Only use the optimised trilinear function and try again. It seems like you are trying to...

Hi @webdaren , is there a reason why you've trained for exactly 35,000 steps? All the listed results are based on models trained for 60k steps or longer.

Thanks for your question! You are right in that right now it looks more like an instance normalization rather than a layer normalization. If you check [this line](https://github.com/NLPLearn/QANet/blob/8107d223897775d0c3838cb97f93b089908781d4/layers.py#L69) I actually...
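To make the distinction concrete, here is a NumPy sketch of the two normalisations over a `[batch, length, channels]` tensor (shapes and axis choices are illustrative, not copied from `layers.py`):

```python
import numpy as np

eps = 1e-6
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 8))  # [batch, length, channels]; arbitrary sizes

# Transformer-style layer normalization: one mean/variance per position,
# taken over the channel axis.
mu_ln = x.mean(axis=-1, keepdims=True)
var_ln = x.var(axis=-1, keepdims=True)
layer_normed = (x - mu_ln) / np.sqrt(var_ln + eps)

# Instance normalization: one mean/variance per sample and per channel,
# taken over the length axis.
mu_in = x.mean(axis=1, keepdims=True)
var_in = x.var(axis=1, keepdims=True)
instance_normed = (x - mu_in) / np.sqrt(var_in + eps)
```

Which axes get reduced is exactly the difference being discussed: the two variants produce different statistics for the same tensor unless length and channel distributions happen to coincide.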

You are correct; however, I believe the difference is minuscule (probably even smaller than a floating-point epsilon), so I saw no point in fixing it.