Neural-Network-p5

Sigmoid derivative in nn.js

Open funderburkjim opened this issue 7 years ago • 3 comments

As I read it, your formula for the derivative of the sigmoid function is wrong in nn.js. You have

NeuralNetwork.dSigmoid = function(x) {
  return x * (1 - x);
}

but it should be

NeuralNetwork.dSigmoid = function(x) {
  var y = NeuralNetwork.sigmoid(x);
  return y * (1 - y);
}
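One way to check the two definitions is to compare each against a central finite-difference estimate of the slope. The sketch below is standalone and assumes nothing from nn.js: `sigmoid`, `dSigmoidOfInput`, `dSigmoidOfOutput`, and `numericSlope` are illustrative names, not the library's.

```javascript
// Standalone check; these helper names are illustrative, not part of nn.js.
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Derivative written as a function of the raw input x.
function dSigmoidOfInput(x) {
  var y = sigmoid(x);
  return y * (1 - y);
}

// The x * (1 - x) form: a derivative only if its argument is already sigmoid(x).
function dSigmoidOfOutput(y) {
  return y * (1 - y);
}

// Central finite-difference estimate of the slope of f at x.
function numericSlope(f, x) {
  var h = 1e-5;
  return (f(x + h) - f(x - h)) / (2 * h);
}

var x = 0.5;
console.log(numericSlope(sigmoid, x));      // reference slope
console.log(dSigmoidOfInput(x));            // agrees with the reference
console.log(dSigmoidOfOutput(sigmoid(x)));  // also agrees: argument is an output
console.log(dSigmoidOfOutput(x));           // differs: argument is a raw input
```

The last line is the case the issue is about: the short form is only valid when it is handed values that have already passed through the sigmoid.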

Reference

The style and substance of your 'Coding Train' material is very enjoyable. 👍 Thank you.

funderburkjim avatar Jan 07 '18 21:01 funderburkjim

For the train method to work properly with the sigmoid activation function and the corrected dSigmoid, two lines in the train method need to be changed.

old: 
var gradient_output = Matrix.map(outputs, this.derivative);
var gradient_hidden = Matrix.map(hidden_outputs, this.derivative);

new:
var gradient_output = Matrix.map(output_inputs, this.derivative);
var gradient_hidden = Matrix.map(hidden_inputs, this.derivative);

This is tricky. Your original code gave the correct answer with a sigmoid activation function, because it applied x * (1 - x) to values that were already sigmoid outputs, but it would, I think, fail with the tanh activation function; the relation between the sigmoid function and its derivative differs from that between the tanh function and its derivative.
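To make the sigmoid/tanh difference concrete: written in terms of the activation's output y, the sigmoid slope is y(1 - y) while the tanh slope is 1 - y². The sketch below uses illustrative helper names (`dTanhFromInput`, `dTanhFromOutput`, `dSigmoidFromOutput`), none of them from nn.js, and suggests why a derivative defined on inputs has to be mapped over the pre-activation values rather than over the outputs.

```javascript
// Illustrative helpers, not taken from nn.js.

// tanh derivative as a function of the pre-activation input x.
function dTanhFromInput(x) {
  var y = Math.tanh(x);
  return 1 - y * y;
}

// tanh derivative as a function of the already-activated output y.
function dTanhFromOutput(y) {
  return 1 - y * y;
}

// Sigmoid's output-form derivative, for comparison.
function dSigmoidFromOutput(y) {
  return y * (1 - y);
}

var hiddenInput = 0.8;                     // a pre-activation value
var hiddenOutput = Math.tanh(hiddenInput); // what the layer emits

// Both of these give the true slope of tanh at hiddenInput.
console.log(dTanhFromInput(hiddenInput));
console.log(dTanhFromOutput(hiddenOutput));

// Mapping the input-form derivative over outputs applies tanh twice: wrong slope.
console.log(dTanhFromInput(hiddenOutput));

// Reusing the sigmoid relation on tanh outputs is wrong as well.
console.log(dSigmoidFromOutput(hiddenOutput));
```

Either convention works as long as the derivative function and the matrix it is mapped over agree on whether the values are inputs or outputs.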

funderburkjim avatar Jan 08 '18 05:01 funderburkjim

Some other small suggestions:

  1. To test out the above, I used the 'copy' method of NeuralNetwork. In the course of this, I noticed an error at line 59: this.lr = this.lr, which should be this.lr = nn.lr.

  2. Your naming of variables follows those of the Book quite closely except at one point in train. You use variable names output_inputs and outputs where the book uses names final_inputs and final_outputs. It would make it slightly easier to compare your code to the book if you used the book variable names here also.

  3. At lines 77,78 of nn.js, you use the randomize method from the Matrix object in matrix.js.

    this.wih.randomize();
    this.who.randomize();

Matrix.randomize() uses a function from p5.js:

this.matrix[i][j] = randomGaussian();

To follow the book more closely, I replaced this (in nn.js) with

      this.wih.nn_randomize_uniform(-0.5,0.5);
      this.who.nn_randomize_uniform(-0.5,0.5);

and added the method nn_randomize_uniform in matrix.js:

Matrix.prototype.nn_randomize_uniform = function(min, max) {
  // Written for nn.js  (ejf)
  // This follows the book.
  // Note: Alternatively, p.133 'Optional: More Sophisticated Weights'
  for (var i = 0; i < this.rows; i++) {
    for (var j = 0; j < this.cols; j++) {
      // Uniform sample in [min, max)
      this.matrix[i][j] = (Math.random() * (max - min)) + min;
    }
  }
};

This has the virtue of following the book's default method, and also removing the dependence of nn.js on p5.js.
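As a quick way to try the method without p5.js, here is a minimal sketch that runs under plain Node. The `Matrix` constructor below is a stand-in I am assuming for illustration (it only provides the `rows`, `cols`, and `matrix` fields the method relies on); the real matrix.js may differ. The method body is repeated so the snippet is self-contained.

```javascript
// Minimal stand-in for the Matrix in matrix.js; the real one may differ.
function Matrix(rows, cols) {
  this.rows = rows;
  this.cols = cols;
  this.matrix = [];
  for (var i = 0; i < rows; i++) {
    this.matrix.push(new Array(cols).fill(0));
  }
}

// Fill every entry with a uniform sample from [min, max).
Matrix.prototype.nn_randomize_uniform = function(min, max) {
  for (var i = 0; i < this.rows; i++) {
    for (var j = 0; j < this.cols; j++) {
      this.matrix[i][j] = Math.random() * (max - min) + min;
    }
  }
};

var wih = new Matrix(3, 2);
wih.nn_randomize_uniform(-0.5, 0.5);
console.log(wih.matrix); // 3 rows of 2 values, each in [-0.5, 0.5)
```

Since it only calls Math.random(), nothing here depends on a p5.js global such as randomGaussian().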

Hope you don't mind these minor nitpicks!

funderburkjim avatar Jan 08 '18 05:01 funderburkjim

This is wonderful, thank you so much for this detailed set of comments! I'm in the process of creating the video tutorials that correspond to this code so I'll work on adding these in as I go!

shiffman avatar Jan 10 '18 18:01 shiffman