Machine-Learning-Explained
Bug?
Hello, I think there might be a bug in your linear_discriminant_analysis.py script. You are computing:
    # Between class scatter matrix
    total_mean = np.mean(X, axis=0)
    S_B = np.empty((n_features, n_features))
    for label in labels:
        _X = X[y == label]
        _mean = np.mean(_X, axis=0)
        S_B += len(_X) * (_mean - total_mean).dot((_mean - total_mean).T)
This is supposed to give an NxN matrix S_B, where N is the number of features. But with
    _mean = np.mean(_X, axis=0)
_mean has shape (N,), which makes (_mean - total_mean).dot((_mean - total_mean).T) a scalar instead of a matrix: transposing a 1-D array has no effect, so dot computes an inner product. S_B keeps the correct shape because of its initialization (the scalar is broadcast across every entry), but its values are incorrect.
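Here is a minimal sketch of what I mean and one possible fix, using np.outer so that each class contributes an (N, N) matrix. The function name between_class_scatter and the toy data are made up for illustration and are not from your script. I also swapped np.empty for np.zeros, which is my own assumption about the intent: np.empty leaves the array uninitialized, so accumulating with += would start from arbitrary values.

```python
import numpy as np


def between_class_scatter(X, y):
    """Between-class scatter matrix, one outer product per class (hypothetical helper)."""
    n_features = X.shape[1]
    total_mean = np.mean(X, axis=0)
    # Assumption: start from zeros so that += accumulates from a defined value.
    S_B = np.zeros((n_features, n_features))
    for label in np.unique(y):
        _X = X[y == label]
        diff = np.mean(_X, axis=0) - total_mean  # shape (N,)
        S_B += len(_X) * np.outer(diff, diff)    # shape (N, N)
    return S_B


# Toy data, made up for illustration: 4 samples, 2 features, 2 classes.
X = np.array([[1.0, 2.0], [2.0, 3.0], [6.0, 5.0], [7.0, 8.0]])
y = np.array([0, 0, 1, 1])

diff = np.mean(X[y == 0], axis=0) - np.mean(X, axis=0)
inner = diff.dot(diff.T)     # 1-D dot 1-D: an inner product, i.e. a scalar
outer = np.outer(diff, diff)  # the intended (N, N) matrix

print(np.ndim(inner))  # 0
print(outer.shape)     # (2, 2)

S_B = between_class_scatter(X, y)
print(S_B)
```

Reshaping the difference to a column vector with diff.reshape(-1, 1) and then calling .dot(diff.T) on that would give the same matrix, if you would rather keep the dot-product form.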
I might be wrong, but please consider this.