introduction_to_ml_with_python

Use NMF.inverse_transform when reconstructing the dataset

Open: qinhanmin2014 opened this issue 7 years ago • 1 comment

In notebook 03-unsupervised-learning, the reconstruction code currently reads:

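import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF, PCA
from sklearn.model_selection import train_test_split

# X_people and y_people are defined earlier in the notebook
X_train, X_test, y_train, y_test = train_test_split(
    X_people, y_people, stratify=y_people, random_state=0)
nmf = NMF(n_components=100, random_state=0)
nmf.fit(X_train)
pca = PCA(n_components=100, random_state=0)
pca.fit(X_train)
kmeans = KMeans(n_clusters=100, random_state=0)
kmeans.fit(X_train)

# Reconstruct the test set from each 100-component representation
X_reconstructed_pca = pca.inverse_transform(pca.transform(X_test))
X_reconstructed_kmeans = kmeans.cluster_centers_[kmeans.predict(X_test)]
X_reconstructed_nmf = np.dot(nmf.transform(X_test), nmf.components_)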

Maybe we can change

X_reconstructed_nmf = np.dot(nmf.transform(X_test), nmf.components_)

to

X_reconstructed_nmf = nmf.inverse_transform(nmf.transform(X_test))

This would be more consistent with the PCA line, and I think it's better to rely on the scikit-learn API than to multiply by nmf.components_ by hand.
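For reference, here is a minimal sketch suggesting the two forms should give the same reconstruction. It uses small synthetic non-negative data as a stand-in for X_people (an assumption, chosen only so the snippet runs on its own), and the names W_test, X_rec_dot, and X_rec_api are illustrative, not from the notebook. It assumes a scikit-learn version that already provides NMF.inverse_transform.

```python
import numpy as np
from sklearn.decomposition import NMF

# Synthetic non-negative data standing in for the faces dataset (assumption).
rng = np.random.RandomState(0)
X_train = rng.rand(60, 20)
X_test = rng.rand(15, 20)

# Fewer components and a higher max_iter than the notebook, to keep this quick.
nmf = NMF(n_components=5, random_state=0, max_iter=500)
nmf.fit(X_train)

# Compute the test-set coefficients once and reconstruct both ways.
W_test = nmf.transform(X_test)
X_rec_dot = np.dot(W_test, nmf.components_)   # current notebook form
X_rec_api = nmf.inverse_transform(W_test)     # proposed form

# Check that the two reconstructions agree up to floating-point tolerance.
same = np.allclose(X_rec_dot, X_rec_api)
print(same)
```

If the two agree as expected, the change would only affect readability and consistency with the PCA line, not the resulting figures.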

qinhanmin2014, Dec 17 '18 02:12

Yeah, good idea. I opened the issue for adding this (https://github.com/scikit-learn/scikit-learn/issues/6118) as a reaction to writing this code ;)

So when I wrote this, the feature wasn't there, but now it can be fixed.

amueller, Dec 17 '18 18:12