agupta74

Results 2 issues of agupta74

What hyper-parameter settings (learning rate, batch size, etc.) were used for fine-tuning the Albert-v2 module on the MNLI task? I am seeing an accuracy of ~82.6, compared to the 84.6 reported in...

The final softmax probability values are not the same if the amount of padding changes. It looks like for some of the functions, such as reduce_mean and reduce_max, the padding is...
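
To illustrate the kind of effect this issue describes, here is a minimal sketch in plain Python. It assumes the problem is that padded positions take part in the reduction; the issue text is truncated, so this is a guess at the mechanism, not a description of the module's actual code. The helpers `mean_pool`, `max_pool`, `pad` and `make_mask` are hypothetical names made up for this example.

```python
def mean_pool(values, mask=None):
    """Average a sequence; with a mask, only positions where mask == 1 count."""
    if mask is None:
        return sum(values) / len(values)
    kept = [v for v, m in zip(values, mask) if m]
    return sum(kept) / len(kept)


def max_pool(values, mask=None):
    """Maximum of a sequence; with a mask, padded positions are ignored."""
    if mask is None:
        return max(values)
    return max(v for v, m in zip(values, mask) if m)


def pad(seq, total, pad_value=0.0):
    """Right-pad seq with pad_value up to length total (assumed padding scheme)."""
    return seq + [pad_value] * (total - len(seq))


def make_mask(n_real, total):
    """1 for real tokens, 0 for padding."""
    return [1] * n_real + [0] * (total - n_real)


tokens = [-2.0, -4.0, -3.0]          # made-up per-token features
short, long_ = pad(tokens, 4), pad(tokens, 8)

# Unmasked reductions: the mean shifts with the amount of padding,
# and the max is taken over by the pad value.
naive_mean_short = mean_pool(short)
naive_mean_long = mean_pool(long_)
naive_max = max_pool(short)

# Masked reductions: the result depends only on the real tokens.
masked_mean_short = mean_pool(short, make_mask(3, 4))
masked_mean_long = mean_pool(long_, make_mask(3, 8))
masked_max = max_pool(long_, make_mask(3, 8))

print(naive_mean_short, naive_mean_long)    # differ between paddings
print(masked_mean_short, masked_mean_long)  # identical
```

If something like this is what happens upstream of the softmax, the pooled features (and so the final probabilities) would vary with padding length unless an input mask is applied before the reduction.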