Toy-Neural-Network-JS
Adjust at the end
Hi,
I don't know if this has already been mentioned, but it is still not fixed in the code. When you train the model, you calculate the weights_deltas between the hidden layer and the output layer, but you update those weights too early: you then use the new weights to calculate the deltas for the previous layer's weights during back-propagation. You have to keep the deltas in memory and apply them at the end, after the rest of the backward pass.
That's why your model takes so long to train on the XOR problem; it shouldn't take that long. From memory, it took 50,000 iterations, which is far too many for a problem like this.
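In code, the fix can be sketched like this (a minimal, hypothetical 1-1-1 network with sigmoid activations; all the names here are mine, not the library's actual API):

```javascript
// Sketch: compute ALL deltas first, then apply them after the full
// backward pass. (Hypothetical 1-1-1 network, illustrative names.)
const sigmoid = x => 1 / (1 + Math.exp(-x));

let wHO = 0.5;   // hidden -> output weight
let wIH = 0.3;   // input  -> hidden weight
const lr = 0.1;

const input = 1.0, target = 1.0;

// Forward pass.
const hidden = sigmoid(wIH * input);
const output = sigmoid(wHO * hidden);

// Backward pass: compute every delta BEFORE touching any weight.
const outputError = target - output;
const outputGradient = outputError * output * (1 - output);
const deltaHO = lr * outputGradient * hidden;

// The hidden error must use the OLD wHO, not wHO + deltaHO.
const hiddenError = outputGradient * wHO;
const hiddenGradient = hiddenError * hidden * (1 - hidden);
const deltaIH = lr * hiddenGradient * input;

// Only now, after the whole backward pass, update the weights.
wHO += deltaHO;
wIH += deltaIH;
```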
Keep going ^^.
P.S.: Sorry if my English is not perfect; it is not my native language.
I wonder if gradients should be updated in the same way?
Sorry, I don't know the algorithm well
I don't think it matters, or I didn't understand your comment... sorry ^^
For the algorithm, the order is important, because if the weights are updated too early, the following back-propagation steps are disturbed. For the example of a 3-layer neural network it doesn't really matter much, but for a deeper NN this is very important...
:)
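A tiny numeric example (made-up values, illustrative names) showing why the order matters: the hidden-layer error differs depending on whether the output weight was already updated when you back-propagate through it:

```javascript
// Toy numbers showing that the hidden-layer error changes if the
// hidden->output weight is updated before back-propagating through it.
const lr = 0.5;
const wHO = 1.0;            // hidden -> output weight before the update
const outputGradient = 0.2;
const hidden = 0.6;
const deltaHO = lr * outputGradient * hidden; // 0.06

// Correct: back-propagate through the OLD weight.
const hiddenErrorCorrect = outputGradient * wHO;           // 0.2
// Wrong: the weight was already updated before back-propagating.
const hiddenErrorWrong = outputGradient * (wHO + deltaHO); // 0.212

console.log(hiddenErrorCorrect, hiddenErrorWrong);
```

With only one hidden layer the discrepancy stays small, but it compounds layer by layer in a deeper network.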
Based on the way you mentioned earlier, I assume the calculation is done as follows:

deltas = []
for layer of layers:
    delta = calculateDelta()
    deltas.append(delta)
    calculateGradient()
    bias.addGradient()
for delta of deltas:
    for weight of weights:
        weight.add(delta)

Am I right?
Yes, but the bias must be treated in the same way as the weights, for the same reason.
And note that gradients and deltas are different things, but the deltas are calculated using the gradients (just in case that was not clear).
So now:

deltas_w = []
deltas_bias = []
for layer of layers:
    gradient = calculateGradient()  // calculated with the learning rate and multiplied by -1 for the descent part
    delta = calculateDelta(gradient)
    deltas_w.append(delta)
    deltas_bias.append(gradient)    // according to the model
for delta of deltas_w:
    for weight of weights:
        weight.add(delta)
for b of deltas_bias:
    bias.add(b)
This algorithm may be wrong, and it may only work for a convolutional neural network.
Hope it helps
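The pseudocode above could be made runnable roughly like this (my own minimal model, a chain of scalar "layers" with a = sigmoid(w * a + b); all names are illustrative, not the library's actual API):

```javascript
// Runnable sketch of the corrected update order: collect every weight
// and bias delta during the backward pass, apply them all at the end.
const sigmoid = x => 1 / (1 + Math.exp(-x));

const layers = [
  { w: 0.4,  b: 0.1 },
  { w: -0.6, b: 0.2 },
  { w: 0.8,  b: -0.1 },
];
const lr = 0.5;

function trainStep(input, target) {
  // Forward pass, remembering every activation for the backward pass.
  const activations = [input];
  for (const layer of layers) {
    const prev = activations[activations.length - 1];
    activations.push(sigmoid(layer.w * prev + layer.b));
  }

  // Backward pass: collect all deltas, but do not touch any weight yet.
  const deltasW = [];
  const deltasB = [];
  let error = target - activations[activations.length - 1];
  for (let i = layers.length - 1; i >= 0; i--) {
    const a = activations[i + 1];
    const gradient = error * a * (1 - a);   // sigmoid derivative
    deltasW[i] = lr * gradient * activations[i];
    deltasB[i] = lr * gradient;
    // Propagate through the OLD weight, before any update is applied.
    error = gradient * layers[i].w;
  }

  // Update at the end, only after the whole backward pass is done.
  layers.forEach((layer, i) => {
    layer.w += deltasW[i];
    layer.b += deltasB[i];
  });
}

trainStep(1.0, 1.0);
```

Here the bias deltas are buffered exactly like the weight deltas, so neither is applied before the error has been propagated all the way back.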
but bias must be treated in the same way
That's what I meant earlier. Thanks for the info
You're welcome