chanhou

2 issues by chanhou

The tanh in the hidden layer is unnecessary, since the head parameters need the raw output of the hidden layer rather than its tanh-transformed value. Secondly, using a lower learning rate can...

Hi, first of all, thank you for your contribution. I am experimenting with your code and the model you produced. When I try to produce a model similar to copy.mdl without l2...