Min

18 comments of Min

I suggest splitting this chain into a three-way fork, with some shared skills between them.

Will take a look into it! Much appreciated @pencoa :)

Hi @chesterkuo , is your result above obtained after this commit? f0c79cc93dc1dfdad2bc8abb712a53d078814a56

Thanks for reporting your results on CPU @ajay-sreeram! That is some impressive patience. I talked to the original...

Hi @PhungVanDuy , it seems like you are using the latest commit. @jasonwbw mentioned in the previous comment that he used the model before commit https://github.com/NLPLearn/QANet/commit/f0c79cc93dc1dfdad2bc8abb712a53d078814a56. You could revert back...
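Reverting a clone to the known-good commit is just a checkout. A sketch in a throwaway repository (file names and commit messages are made up for the demo; in a real clone of NLPLearn/QANet you would check out the hash above directly):

```shell
# Demo in a temporary repo: two commits, then check out the first one,
# just as you would run
#   git checkout f0c79cc93dc1dfdad2bc8abb712a53d078814a56
# inside a clone of NLPLearn/QANet.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git config user.email demo@example.com
git config user.name demo
echo "old trilinear" > layers.py
git add layers.py && git commit -qm "known-good"
good=$(git rev-parse HEAD)
echo "new trilinear" > layers.py
git add layers.py && git commit -qm "later change"
git checkout -q "$good"   # detached HEAD at the known-good commit
cat layers.py             # -> old trilinear
```

Note this leaves the repo in a detached-HEAD state, which is fine for re-running training against an older snapshot.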

@PhungVanDuy can you post the error you get here?

@jasonwbw looking at the error above, did you use the old trilinear function for the pretrained model? It seems like the exponential moving average for the optimised trilinear function is missing.
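For context, the exponential moving average (EMA) here keeps a decayed shadow copy of each weight alongside the raw value; a checkpoint that expects those shadow variables cannot be restored if they were never created. A minimal sketch in plain Python (illustrative only; the decay value and update schedule are assumptions, not taken from the QANet codebase):

```python
# Sketch of an exponential moving average over a single weight value.
# decay=0.5 is chosen only to make the numbers easy to follow.
def ema_update(shadow, value, decay=0.5):
    """Return the new shadow value after one EMA step."""
    return decay * shadow + (1.0 - decay) * value

shadow = 0.0
for step_value in [1.0, 1.0, 1.0]:
    shadow = ema_update(shadow, step_value)
print(shadow)  # 0.5 -> 0.75 -> 0.875, drifting toward the raw value 1.0
```

At evaluation time the shadow values stand in for the raw weights, which is why a model saved with EMA variables for one trilinear implementation will not load cleanly against another.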

@PhungVanDuy comment out the old trilinear function as it is in the latest commit. Only use the optimised trilinear function and try again. It seems like you are trying to...

Hi @webdaren , is there a reason why you've trained for exactly 35,000 steps? All the listed results are based on models trained for 60k steps or longer.

Thanks for your question! You are right in that right now it looks more like an instance normalization rather than a layer normalization. If you check [this line](https://github.com/NLPLearn/QANet/blob/8107d223897775d0c3838cb97f93b089908781d4/layers.py#L69) I actually...
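To make the distinction concrete, here is a NumPy sketch of the two normalisations over a `[batch, length, channels]` tensor (shapes and axis choices are illustrative, not copied from `layers.py`):

```python
import numpy as np

eps = 1e-6
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 8))  # [batch, length, channels]; arbitrary sizes

# Transformer-style layer normalization: one mean/variance per position,
# taken over the channel axis.
mu_ln = x.mean(axis=-1, keepdims=True)
var_ln = x.var(axis=-1, keepdims=True)
layer_normed = (x - mu_ln) / np.sqrt(var_ln + eps)

# Instance normalization: one mean/variance per sample and per channel,
# taken over the length axis.
mu_in = x.mean(axis=1, keepdims=True)
var_in = x.var(axis=1, keepdims=True)
instance_normed = (x - mu_in) / np.sqrt(var_in + eps)
```

Which axes get reduced is exactly the difference being discussed: the two variants produce different statistics for the same tensor unless length and channel distributions happen to coincide.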

You are correct; however, I believe the difference is minuscule (probably even smaller than a floating-point epsilon), so I saw no point in fixing it.