Enhance PCA Decomposition

Open bbengfort opened this issue 7 years ago • 12 comments

We should enhance the PCADecomposition visualizer to offer many of the features the Manifold visualizer provides, including:

  • [ ] Color points by class with a legend (See #458)
  • [ ] Color points by heatmap for continuous y and add a colorbar
  • [ ] Add alpha parameter (see #475)
  • [ ] Add random state to pass to PCA
  • [ ] Allow user to pass in a PCA transformer/pipeline
  • [ ] Update tests with better random data sets (more points; see manifold tests)
  • [ ] Include explained variance/noise variance (or explained variance ratio) in chart
  • [ ] Enhance biplots documentation

See also #455, another enhancement that may not be directly related to this one.

bbengfort avatar Jun 14 '18 14:06 bbengfort

Hey! I'm interested in tackling this.

rohit-ganapathy avatar Feb 18 '19 07:02 rohit-ganapathy

@rohit-ganapathy - that would be great, feel free to open a PR when you're ready for us to take a look.

bbengfort avatar Feb 18 '19 15:02 bbengfort

Can I start working on this, even if @rohit-ganapathy is assigned?

dnabanita7 avatar Feb 19 '19 00:02 dnabanita7

Hello @Naba7 — as we explained last week in response to your questions on #738 and #677, we do not "assign" issues or reserve issues for contributors. Anyone is welcome to submit a PR for a feature or bugfix they work on.

However, given that you already have one PR open that still needs to be completed (#755), have started working on #615, and are new to working on Yellowbrick and still getting to know our API, we would really appreciate if you would focus on getting those first PRs across the finish line before starting anything new.

We appreciate your enthusiasm about contributing to Yellowbrick. One of the most important lessons to learn is that open source is a marathon, not a sprint, so we hope you can be patient and enjoy the journey — we promise Yellowbrick isn't going away!

rebeccabilbro avatar Feb 19 '19 02:02 rebeccabilbro

It's so exciting and fun. I want to know and learn things quickly, so I am asking questions to get assigned everywhere. Sorry, I have noted this now.

dnabanita7 avatar Feb 19 '19 02:02 dnabanita7

@naresh-bachwani has this issue been fixed by your work this summer?

bbengfort avatar Aug 29 '19 00:08 bbengfort

@bbengfort I think only the explained variance charts are left! But that will be covered in decomposition, right?

naresh-bachwani avatar Aug 31 '19 16:08 naresh-bachwani

@naresh-bachwani ExplainedVariance is separate to this issue. Would you mind ticking the checkboxes above based on your work?

bbengfort avatar Sep 04 '19 17:09 bbengfort

Can these functions be applied to FastICA in Scikit-Learn (or maybe any ICA)? Also observing https://github.com/DistrictDataLabs/yellowbrick/issues/615 and https://github.com/DistrictDataLabs/yellowbrick/issues/316

BradKML avatar Mar 12 '22 10:03 BradKML

@BrandonKMLee very possibly, it wouldn't hurt to try. I think you'd have to change the pca_transformer attribute on the PCA visualizer, setting it to a pipeline similar to the code here: https://github.com/DistrictDataLabs/yellowbrick/blob/develop/yellowbrick/features/pca.py#L184-L189. This would have to be done after initialization and before any call to fit or transform. I don't see any place it wouldn't work, unless FastICA or ICA lacks required attributes like n_components_.

You could also try passing an initialized FastICA or ICA transformer as the manifold attribute to the Manifold visualizer - this might not give you the same features as ICA, but should give you the projected visualization.
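
A minimal sketch of the scale-then-decompose pipeline suggested in the first paragraph of this comment, using only scikit-learn. The synthetic data, the step names, and the commented-out assignment at the end are assumptions for illustration; whether the visualizer works with this replacement has not been verified here.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic non-Gaussian data, for illustration only.
rng = np.random.RandomState(0)
X = rng.laplace(size=(200, 5))

# Scale-then-decompose pipeline with FastICA in the decomposition slot.
# The step names "scale" and "ica" are arbitrary choices for this sketch.
ica_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("ica", FastICA(n_components=2, random_state=0)),
])

# Project the data down to two components.
X_projected = ica_pipeline.fit_transform(X)
print(X_projected.shape)

# Assumption, not verified here: following the comment above, the swap would
# happen after the visualizer is created and before fit or transform, e.g.
#   visualizer.pca_transformer = ica_pipeline
```

The pipeline itself runs on its own, so the projection can be inspected before involving the visualizer at all.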

bbengfort avatar Mar 12 '22 13:03 bbengfort

@bbengfort n_components_in_ for FastICA, but at the same time explained variance could be a problem: the components are expected to have evenly distributed significance rather than being ordered, and no such function currently exists for FastICA.
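
The explained-variance gap can be checked directly in scikit-learn. A small sketch on synthetic data follows; the data and the list of attribute names to inspect are choices made for this illustration, not anything taken from Yellowbrick.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

# Synthetic non-Gaussian data, for illustration only.
rng = np.random.RandomState(0)
X = rng.laplace(size=(200, 5))

pca = PCA(n_components=2, random_state=0).fit(X)
ica = FastICA(n_components=2, random_state=0).fit(X)

# Attribute names to look for on each fitted estimator.
names = ["components_", "n_components_", "explained_variance_ratio_"]

# Map each name to (present on PCA, present on FastICA).
report = {name: (hasattr(pca, name), hasattr(ica, name)) for name in names}

for name, (on_pca, on_ica) in report.items():
    print(name, "PCA:", on_pca, "FastICA:", on_ica)
```

A fitted FastICA has no explained_variance_ratio_, so any code that reads that attribute would need a guard or an alternative measure.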

BradKML avatar Mar 12 '22 14:03 BradKML

@BrandonKMLee OK, that makes sense, so FastICA may not work unless we create a specialized manifold for it.

bbengfort avatar May 21 '22 18:05 bbengfort