Machine-Learning-Explained

Bug?

Open jakub-jurek opened this issue 1 year ago • 0 comments

Hello, I think there might be a bug in your linear_discriminant_analysis.py script. You are computing the between-class scatter matrix like this:

    total_mean = np.mean(X, axis=0)
    S_B = np.empty((n_features, n_features))
    for label in labels:
        _X = X[y == label]
        _mean = np.mean(_X, axis=0)
        S_B += len(_X) * (_mean - total_mean).dot((_mean - total_mean).T)

This gives an N×N matrix S_B, where N is the number of features. But with

_mean = np.mean(_X, axis=0)

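_mean has shape (N,), so the .T is a no-op and (_mean - total_mean).dot((_mean - total_mean).T) is a scalar (the inner product) instead of an N×N matrix (the outer product). S_B still has the correct shape because of how it is initialized, but the scalar is broadcast into every entry, so its values are incorrect.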
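Here is a minimal sketch of one possible fix, in case it helps. It is not taken from the repository: the function name between_class_scatter and the use of np.unique(y) in place of labels are my own assumptions. It uses np.outer to get the N×N outer product, and it also starts from np.zeros, because np.empty leaves arbitrary values in the array before the accumulation begins.

```python
import numpy as np


def between_class_scatter(X, y):
    """Between-class scatter matrix S_B, shape (n_features, n_features).

    Hypothetical helper for illustration; not the repository's function.
    """
    n_features = X.shape[1]
    total_mean = np.mean(X, axis=0)
    # Start from zeros: np.empty would leave arbitrary values in S_B.
    S_B = np.zeros((n_features, n_features))
    for label in np.unique(y):
        _X = X[y == label]
        diff = np.mean(_X, axis=0) - total_mean  # shape (n_features,)
        # np.outer builds the (n_features, n_features) outer product.
        S_B += len(_X) * np.outer(diff, diff)
    return S_B


# For 1-D arrays .T changes nothing, so .dot returns the inner product.
d = np.array([1.0, 2.0, 3.0])
print(np.ndim(d.dot(d.T)))   # 0, i.e. a scalar
print(np.outer(d, d).shape)  # (3, 3)
```

An equivalent option would be to reshape the difference to a column vector with diff.reshape(-1, 1) and keep the .dot(...T) form.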

I might be wrong, but please consider this.

jakub-jurek, Nov 07 '23 12:11