DeepLearningPython
np.dot(w, activation) throws an error in backprop(x, y)
This code

```python
for b, w in zip(self.biases, self.weights):
    z = np.dot(w, activation)+b
    zs.append(z)
    activation = sigmoid(z)
    activations.append(activation)
```
throws the following error, and I don't really know why. (I am using Python 3.8.3 and NumPy 1.18.4.)

```
ValueError: operands could not be broadcast together with shapes (10,784) (10,30)
```
Note also that VS Code reports the shapes of `w` and `activation` as (10, 30) and (30, 784) respectively, which differs from the shapes in the error.
I can confirm on Python 3.8.2 with NumPy 1.18.3 that this error occurs (albeit with reported shapes of (30, 1) and (10, 1)), with both the 2.x and 3.x versions of the code.
The line in question is line 123 of network.py.
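As a sanity check, `np.dot` fails loudly on its own when the inner dimensions disagree. A minimal sketch (the shapes are the ones VS Code reported for this [784, 30, 10] network; the variable names are illustrative):

```python
import numpy as np

w = np.random.randn(10, 30)           # output-layer weights for a [784, 30, 10] network
activation = np.random.randn(784, 1)  # wrong shape: should be (30, 1) at this layer

# np.dot requires the inner dimensions to match (30 vs 784 here), so it raises.
try:
    z = np.dot(w, activation)
    failed = False
except ValueError as e:
    failed = True
    print(e)  # shapes (10,30) and (784,1) not aligned

print(failed)  # True
```

So the question is less "why does `np.dot` complain" and more "how did an activation with the wrong shape reach this layer in the first place".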
me too! any fixes?
I'm late to the game, but I'm curious whether other people were using the `load_data` function instead of the `load_data_wrapper` function from the `mnist_loader` file.
I'd be surprised if others made the same mistake I did, but I thought I had the problem in this ticket for a while until I realized my error; that's what was going wrong for me.
I just think a mismatch severe enough that the vectors can't even line up for a dot product is a good indicator that it's a math/logic problem rather than a Python/NumPy version issue.
I'm using Python 3.6.9 and Numpy 1.19.1
> I'm late to the game, but I'm curious whether other people were using the `load_data` function instead of the `load_data_wrapper` function from the `mnist_loader` file.
`training_data, validation_data, test_data = load_data()` returns

```
ValueError: shapes (16,2296) and (50000,784) not aligned: 2296 (dim 1) != 50000 (dim 0)
```

and `training_data, validation_data, test_data = load_data_wrapper()` returns

```
ValueError: setting an array element with a sequence.
```

I also wrote my own function to load and format the data, and that throws

```
ValueError: shapes (16,2296) and (784,) not aligned: 2296 (dim 1) != 784 (dim 0)
```

All three are complaining about line 101 (`z = np.dot(w, activation) + b`), and I can't seem to find the issue with any of them.
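One way shapes like these go wrong quietly: `load_data` hands back each image as a flat (784,) vector, while the book's code expects (784, 1) column vectors (which `load_data_wrapper` produces). With a 1-D activation, `np.dot(w, x)` yields a 1-D (30,) result, and adding a (30, 1) bias then *broadcasts* to a (30, 30) matrix instead of raising, corrupting every shape downstream. A minimal sketch (shapes assume the book's [784, 30, 10] network; names are illustrative):

```python
import numpy as np

w = np.random.randn(30, 784)   # first-layer weights
b = np.random.randn(30, 1)     # biases stored as column vectors, as in network.py

x_raw = np.zeros(784)          # load_data-style input: flat 1-D vector
x_col = x_raw.reshape(784, 1)  # load_data_wrapper-style input: column vector

z_bad = np.dot(w, x_raw) + b   # (30,) + (30,1) broadcasts to (30, 30) -- silently wrong
z_good = np.dot(w, x_col) + b  # (30,1) + (30,1) stays (30, 1)

print(z_bad.shape)   # (30, 30)
print(z_good.shape)  # (30, 1)
```

This would explain errors that only surface a few lines later with shapes that look unrelated to anything you loaded.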
Fixed. Change

```python
np.dot(delta, activations[-2].transpose())
```

to

```python
np.outer(delta, activations[-2].transpose())
```
The shapes of the two arrays are (a,) and (b,) respectively. When both arguments are 1-D, `np.dot` attempts an inner product, so it cannot give us an a×b result. Use `np.outer` instead to get the matrix we want.
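To see the difference concretely, here is a sketch (the (10,) and (30,) shapes match the output and hidden layers of the book's [784, 30, 10] network; note that `.transpose()` is a no-op on a 1-D array, which is why the original line breaks):

```python
import numpy as np

delta = np.random.randn(10)  # 1-D output-layer error; .transpose() does nothing to it
act = np.random.randn(30)    # 1-D previous-layer activations

# np.dot on two 1-D arrays attempts an inner product and fails for 10 != 30;
# np.outer always builds the full (10, 30) gradient matrix we want.
grad = np.outer(delta, act)
print(grad.shape)  # (10, 30)

# With proper (n, 1) column vectors, the book's original np.dot line works as intended:
grad2 = np.dot(delta.reshape(10, 1), act.reshape(30, 1).transpose())
print(grad2.shape)  # (10, 30)
```

That said, swapping in `np.outer` papers over the symptom: if `delta` and the activations are 1-D at this point, the inputs were likely never reshaped into column vectors, which the earlier comments suggest is the real bug.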