Added usage examples
Docs
Added usage examples to:
- decomposition.umap_reconstruction.UMAPOutlierDetection
- decomposition.pca_reconstruction.PCAOutlierDetection
Fixes #652 and #653
Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Wondering what happened here
Oh gee, there are even more failed tests now.😮
About examples
I copied the array-based approach from the scikit-learn docs on PCA (check ss)
and tinkered with the values to see how the detectors behave for different array values, n_components and thresholds. I tried adding values that would clearly be outliers by quantiles, for example [-100, 99, -99], but PCAOutlierDetection and UMAPOutlierDetection "classified" them as inliers. I also saw in the User Guide that the points these detectors flag as outliers don't look like quantile-based outliers, so you can't spot them just by looking. If that makes any sense.
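To make the point above concrete, here is a minimal NumPy sketch of the general reconstruction-error idea. It is not the scikit-lego implementation, and the toy data, the `reconstruction_error` helper and the test points are all made up for illustration. It shows that a point can be extreme per feature and still reconstruct almost perfectly if it lies along the kept principal direction.

```python
import numpy as np

# Made-up toy data lying close to the line y = x (assumption, not from the PR).
X = np.array([
    [-2.9, -3.1], [-2.2, -1.8], [-0.9, -1.1],
    [1.1, 0.9], [1.8, 2.2], [3.1, 2.9],
])

mean = X.mean(axis=0)
# Rows of vt are the principal directions; keep only the first one.
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
components = vt[:1]


def reconstruction_error(points):
    """Euclidean distance between each point and its PCA reconstruction."""
    centered = np.asarray(points, dtype=float) - mean
    reconstructed = centered @ components.T @ components
    return np.linalg.norm(centered - reconstructed, axis=1)


# Extreme by any quantile rule, but on the principal axis.
err_far = reconstruction_error([[100.0, 100.0]])[0]
# Unremarkable per feature, but off the principal axis.
err_off = reconstruction_error([[1.0, -1.0]])[0]

print(f"far, on-axis point:   {err_far:.4f}")
print(f"near, off-axis point: {err_off:.4f}")
```

In this sketch the far point gets a much smaller error than the off-axis one, which would match the behaviour I saw: a reconstruction-based detector cares about distance from the learned subspace, not about how large the raw values are.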
Yep, the doc page uses the iris dataset, which I would not expect to have any particular outliers. We have one obvious example in the test suite.
@koaning thoughts on this? In my opinion, it could be worth changing the dataset in the user guide as well. It seems a bit confusing.
Sure, I'll do it the way it's done in the test suite.
Hey Francesco, I'm back at it. I've been thinking about the examples for PCA and UMAP. What bugs me is this: if I use the obvious example, which is a 10-d array, how can I show the resulting outliers? Should I print the 10-d output? The numbers in the arrays I used may seem arbitrary, but at least they let us show the outliers in a simple one-line output. Let me know wdyt.
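One option I can think of, sketched below with plain NumPy rather than the actual detectors: keep the 10-d input, but print only the row indices that get flagged. Everything here is an assumption for illustration (the synthetic data, the 0.1 threshold, and the 1 / -1 label convention borrowed from scikit-learn's outlier detectors); it is not the test-suite example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 10-d data: 20 rows that live on a 2-d subspace.
latent = rng.normal(size=(20, 2))
mixing = rng.normal(size=(2, 10))
X_clean = latent @ mixing

# Copy the data and push row 5 off the subspace to create one obvious outlier.
X = X_clean.copy()
X[5] += 5.0 * rng.normal(size=10)

# Fit a 2-component PCA on the clean rows via SVD.
mean = X_clean.mean(axis=0)
_, _, vt = np.linalg.svd(X_clean - mean, full_matrices=False)
components = vt[:2]

# Reconstruction error per row, then labels: 1 = inlier, -1 = outlier.
centered = X - mean
errors = np.linalg.norm(centered - centered @ components.T @ components, axis=1)
labels = np.where(errors > 0.1, -1, 1)  # 0.1 is an arbitrary threshold

# One short line of output, regardless of the input dimensionality.
outlier_rows = np.flatnonzero(labels == -1)
print("outlier rows:", outlier_rows.tolist())
```

The docstring example could then show only the predicted labels or the flagged indices, so the output stays on one line even though the input is 10-d.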