yellowbrick
WIP: Allows Silhouette Visualizer to accept DensityEstimator
This PR fixes #1303, in which a user reported that GMM could not be used as a clustering model with the Silhouette Visualizer.
They received this traceback:
yellowbrick.exceptions.YellowbrickTypeError: The supplied model is not a clustering estimator; try a classifier or regression score visualizer instead!
Once I resolved the above issue, I encountered another problem: the GMM estimator does not have an n_clusters attribute.
I have made the following changes:
1. I added a new `is_density` function to the utils/types file
2. I then used the `is_density` function with the ClusteringScoreVisualizer class to allow for DensityEstimators to be used by this class
3. ~I fixed the attribute error by using a try/except clause to set `self.n_clusters_` equal to `self.estimator.n_components` in the silhouette.py file~
4. Checked if self.estimator has the n_components attribute that the Density Estimator possesses and set self.n_clusters_ to self.estimator.n_components
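For reviewers, here is a rough sketch of the type check described in the list above. It is not the code in this PR's diff. It assumes the estimator advertises its kind through an `_estimator_type` attribute (the value `"DensityEstimator"` is what scikit-learn's density mixin used at the time, as far as I know). The `Fake*` classes and the `*_sketch` / `check_clustering_model` names are hypothetical stand-ins so the snippet runs without scikit-learn or yellowbrick installed.

```python
# Hypothetical stand-in estimators; they only mimic the attributes the check reads.
class FakeKMeans:
    _estimator_type = "clusterer"

    def __init__(self, n_clusters=8):
        self.n_clusters = n_clusters


class FakeGaussianMixture:
    _estimator_type = "DensityEstimator"  # assumed marker for density estimators

    def __init__(self, n_components=1):
        self.n_components = n_components


class FakeRegressor:
    _estimator_type = "regressor"


def is_clusterer_sketch(estimator):
    """Return True if the estimator declares itself a clusterer."""
    return getattr(estimator, "_estimator_type", None) == "clusterer"


def is_density_sketch(estimator):
    """Return True if the estimator declares itself a density estimator."""
    return getattr(estimator, "_estimator_type", None) == "DensityEstimator"


def check_clustering_model(estimator):
    """Accept clusterers and density estimators; reject everything else."""
    if not (is_clusterer_sketch(estimator) or is_density_sketch(estimator)):
        raise TypeError(
            "The supplied model is not a clustering estimator; "
            "try a classifier or regression score visualizer instead!"
        )
    return estimator


# Both kinds of model pass the gate; a regressor would raise TypeError.
check_clustering_model(FakeKMeans(n_clusters=5))
check_clustering_model(FakeGaussianMixture(n_components=5))
```

The gate is kept separate from the attribute lookup on purpose: accepting the model type and finding its cluster count are two distinct failure modes in #1303.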
Sample Code
    from sklearn.mixture import GaussianMixture as GMM
    from yellowbrick.cluster import SilhouetteVisualizer
    from sklearn.datasets import make_blobs

    X, y = make_blobs(
        n_samples=1000, n_features=12, centers=5, shuffle=False, random_state=0
    )

    # Instantiate the clustering model and visualizer
    model = GMM(n_components=5, random_state=0)
    visualizer = SilhouetteVisualizer(model, colors='yellowbrick')

    visualizer.fit(X)  # Fit the data to the visualizer
    visualizer.show()  # Finalize and render the figure
PLOT
Questions for the @DistrictDataLabs/team-oz-maintainers:
- [ ] Is the try/except clause a viable solution for missing attributes? I foresee this being an issue because I came across a different attribute error with a different clustering estimator. This could get unwieldy.
- [ ]
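To make the first question concrete, one possible alternative to a try/except per estimator is a single ordered lookup over candidate attribute names. This is only a sketch under assumptions: `CANDIDATE_ATTRS` and `resolve_n_clusters` are hypothetical names, not part of yellowbrick, and the candidate list would need to grow as other estimators turn up with differently named attributes.

```python
from types import SimpleNamespace

# Hypothetical, ordered list of attribute names that may hold the cluster count.
CANDIDATE_ATTRS = ("n_clusters", "n_components")


def resolve_n_clusters(estimator, candidates=CANDIDATE_ATTRS):
    """Return the first candidate attribute that is present and not None."""
    for name in candidates:
        value = getattr(estimator, name, None)
        if value is not None:
            return value
    raise AttributeError(
        "could not determine the number of clusters; looked for: "
        + ", ".join(candidates)
    )


# SimpleNamespace objects stand in for fitted estimators here.
kmeans_like = SimpleNamespace(n_clusters=4)
gmm_like = SimpleNamespace(n_components=5)

print(resolve_n_clusters(kmeans_like))
print(resolve_n_clusters(gmm_like))
```

With this shape, supporting another estimator means adding one name to the tuple instead of another except branch, and the error message lists exactly which attributes were tried.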
CHECKLIST
Codecov Report
Merging #1304 (78c8c6a) into develop (f7a8e95) will increase coverage by 0.00%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## develop #1304 +/- ##
========================================
Coverage 90.70% 90.71%
========================================
Files 93 93
Lines 5327 5332 +5
========================================
+ Hits 4832 4837 +5
Misses 495 495
| Files Changed | Coverage Δ | |
|---|---|---|
| yellowbrick/cluster/base.py | 100.00% <100.00%> (ø) | |
| yellowbrick/cluster/silhouette.py | 85.55% <100.00%> (+0.32%) | :arrow_up: |
| yellowbrick/utils/types.py | 92.15% <100.00%> (+0.49%) | :arrow_up: |
@bbengfort Please hold off on approving this PR, because the problem I addressed here is already fixed more logically in #1294.