
Derivative of the activation function.

Open rmarchesini opened this issue 8 years ago • 0 comments

Hi, my name is Ramiro. I was reading the code and have a question. When you update the parameters between the input layer and the hidden layer (`W1`, `b1`), you compute the derivative of the activation function. I believe this happens on this line (in the ann.py file):

```python
dZ = pY_T.dot(self.W2.T) * (1 - Z*Z)  # tanh
```

In the particular case of tanh, I think `(1 - Z*Z)` is meant to be the derivative. If that is correct, why do we use `Z` here? Recall what is stored in `Z`:

```python
Z = np.tanh(X.dot(self.W1) + self.b1)
```

It seems to me that the derivative should be evaluated at the pre-activation `X.dot(self.W1) + self.b1`, which is the same as `np.arctanh(Z)`. So the result should be `(1 - np.arctanh(Z)*np.arctanh(Z))`. I'm probably wrong; I just want to know why.

Thanks! R.

rmarchesini avatar Nov 23 '17 16:11 rmarchesini
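One way to settle the question is to compare both candidate expressions against a finite-difference approximation of the tanh derivative. Below is a minimal sketch: the pre-activation values `a` are hypothetical stand-ins for `X.dot(self.W1) + self.b1`, not taken from ann.py. Since `Z = tanh(a)` and `d/da tanh(a) = 1 - tanh(a)**2`, substituting gives `1 - Z*Z` directly, with no `arctanh` needed.

```python
import numpy as np

# Hypothetical pre-activation values, standing in for X.dot(self.W1) + self.b1
a = np.linspace(-2.0, 2.0, 9)
Z = np.tanh(a)

# Central-difference approximation of d/da tanh(a)
eps = 1e-6
numeric = (np.tanh(a + eps) - np.tanh(a - eps)) / (2 * eps)

# Expression used in ann.py: the chain rule gives
# d/da tanh(a) = 1 - tanh(a)**2 = 1 - Z*Z, already evaluated at a
expr_code = 1 - Z * Z

# Expression proposed in the issue: 1 - arctanh(Z)**2, i.e. 1 - a**2
expr_issue = 1 - np.arctanh(Z) ** 2

print(np.allclose(numeric, expr_code, atol=1e-6))   # matches the derivative
print(np.allclose(numeric, expr_issue, atol=1e-6))  # does not match
```

The point is that the derivative formula `1 - tanh(a)**2` is naturally expressed in terms of the activation's *output*, so storing `Z = tanh(a)` lets the backward pass reuse it without recomputing (or inverting) the pre-activation.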