auto-sklearn

Feature Request: AutoSklearnOutlierDetector

Open Y-oHr-N opened this issue 7 years ago • 5 comments

Hello,

scikit-learn 0.20 provides a more consistent outlier detection API (see https://speakerdeck.com/albertcthomas/anomaly-detection-in-scikit-learn-ongoing-work-and-future-developments), covering these estimators:

  • covariance.EllipticEnvelope
  • svm.OneClassSVM
  • ensemble.IsolationForest
  • neighbors.LocalOutlierFactor

So I would like an estimator that, like AutoSklearnClassifier, automatically searches over all of these outlier detection models.
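As a minimal sketch of what the shared API makes possible, the four detectors listed above can be driven through the same `fit_predict` call. The toy data and the `contamination`/`nu` values are assumptions chosen only for illustration; this is plain scikit-learn, not an auto-sklearn feature.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
# Toy data (assumed for illustration): 200 Gaussian inliers plus 10 scattered points.
X = np.vstack([rng.normal(size=(200, 2)), rng.uniform(-6, 6, size=(10, 2))])

detectors = {
    "EllipticEnvelope": EllipticEnvelope(contamination=0.05, random_state=0),
    "OneClassSVM": OneClassSVM(nu=0.05),
    "IsolationForest": IsolationForest(contamination=0.05, random_state=0),
    "LocalOutlierFactor": LocalOutlierFactor(contamination=0.05),
}

labels = {}
for name, detector in detectors.items():
    # Every detector returns +1 for predicted inliers and -1 for predicted outliers.
    labels[name] = detector.fit_predict(X)
    print(name, "flagged", int((labels[name] == -1).sum()), "points")
```

An automated detector would have to choose among these models and their hyperparameters, which requires a way to score them.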

Thank you.

Y-oHr-N avatar Nov 07 '18 06:11 Y-oHr-N

Just for clarification, do you think that these should be part of the pipeline tuned by Auto-sklearn or that there should be a standalone mode AutoSklearnOutlierDetector?

According to the title, you want the second option. From my understanding, this is an unsupervised learning problem. The central assumption in Auto-sklearn is that there is a loss function which can be used to tune the hyperparameters. What would such a loss function look like for outlier detection?

mfeurer avatar Nov 19 '18 12:11 mfeurer

Thank you for your reply. As far as I know, there are two metrics for outlier detection.

One is the square of the geometric mean of precision and recall.

  • outliers - Metrics for one-class classification - Cross Validated https://stats.stackexchange.com/questions/192530/metrics-for-one-class-classification
  • Lee, W. S., and Liu, B., "Learning with positive and unlabeled examples using weighted Logistic Regression," In Proceedings of ICML, pp. 448-455, 2003. https://www.aaai.org/Papers/ICML/2003/ICML03-060.pdf
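For illustration, the first metric can be sketched as follows. Since the square of the geometric mean of precision and recall is simply their product, the function returns `precision * recall`. This sketch assumes fully labeled validation data; the function name and the `pos_label` convention are hypothetical, and it is not the implementation from the linked repository or the cited paper.

```python
import numpy as np


def squared_gmean_precision_recall(y_true, y_pred, pos_label=1):
    """Return precision * recall, i.e. the squared geometric mean of the two."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    true_pos = np.sum((y_true == pos_label) & (y_pred == pos_label))
    n_pred_pos = np.sum(y_pred == pos_label)
    n_true_pos = np.sum(y_true == pos_label)
    # Define the score as 0 when precision or recall is undefined.
    if n_pred_pos == 0 or n_true_pos == 0:
        return 0.0
    precision = true_pos / n_pred_pos
    recall = true_pos / n_true_pos
    return float(precision * recall)


# Hypothetical labels: +1 marks the known (normal) class, -1 everything else.
y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, 1, -1, 1, -1]
score = squared_gmean_precision_recall(y_true, y_pred)
print(score)  # precision = 3/4 and recall = 3/4, so the product is 0.5625
```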

The other is the area under the Mass-Volume curve.

  • Goix, N., "How to evaluate the quality of unsupervised anomaly detection algorithms?" In ICML Anomaly Detection Workshop, 2016. https://arxiv.org/pdf/1607.01152.pdf
  • Thomas, A., Clémençon, S., Feuillard, V., and Gramfort, A., "Learning hyperparameters for unsupervised anomaly detection," In ICML Anomaly Detection Workshop, 2016. https://github.com/albertcthomas/anomaly_tuning
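A rough Monte Carlo sketch of the second metric: for each mass level alpha, take the score threshold that keeps a fraction alpha of the data, estimate the volume of the corresponding level set with uniform samples in the data's bounding box, and integrate volume over alpha. Smaller areas are better. The function name, the bounding-box sampling, the range of alpha values, and the two toy scoring functions are all assumptions made for this example; it is not the code from the cited papers or from the anomaly_tuning repository.

```python
import numpy as np


def area_under_mv_curve(score_fn, X, alphas, n_uniform=20000, seed=0):
    """Estimate the area under the Mass-Volume curve (higher score = more normal)."""
    rng = np.random.default_rng(seed)
    alphas = np.asarray(alphas, dtype=float)
    low, high = X.min(axis=0), X.max(axis=0)
    box_volume = float(np.prod(high - low))
    uniform = rng.uniform(low, high, size=(n_uniform, X.shape[1]))
    data_scores = score_fn(X)
    uniform_scores = score_fn(uniform)
    # Threshold t such that roughly a fraction alpha of the data scores >= t.
    thresholds = np.quantile(data_scores, 1.0 - alphas)
    # Volume of {x : score(x) >= t}, estimated inside the bounding box.
    volumes = np.array([box_volume * np.mean(uniform_scores >= t) for t in thresholds])
    # Trapezoidal rule over the mass levels.
    return float(np.sum(0.5 * (volumes[1:] + volumes[:-1]) * np.diff(alphas)))


rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2))
alphas = np.linspace(0.5, 0.95, 10)

# A score that follows the Gaussian density versus one that ignores a feature.
area_good = area_under_mv_curve(lambda Z: -np.sum(Z ** 2, axis=1), X, alphas)
area_bad = area_under_mv_curve(lambda Z: -np.abs(Z[:, 0]), X, alphas)
print(area_good, area_bad)
```

Because this criterion needs no labels, it could in principle serve as the loss for hyperparameter tuning, although the volume estimate becomes unreliable as the number of features grows.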

I implemented two scikit-learn compatible metrics. https://github.com/HazureChi/kenchi/blob/master/kenchi/metrics.py

Y-oHr-N avatar Nov 21 '18 02:11 Y-oHr-N

I'm afraid that I won't have the time to implement something here. Also, I think this is somewhat out of scope for Auto-sklearn if the metrics are not in scikit-learn yet.

mfeurer avatar Nov 30 '18 12:11 mfeurer

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs for the next 7 days. Thank you for your contributions.

github-actions[bot] avatar May 05 '21 01:05 github-actions[bot]

Hi @mfeurer,

Is it possible to wrap a one-class SVM as a two-class classifier and then plug it into AutoSklearnClassifier? What I'm trying to do is:

  1. add a customized classifier (a one-class SVM that takes X_train and pseudo_y_train as input)
  2. define a customized score: if pseudo_y_train is all 0 (only one class), the score is 1e-5; otherwise, the score is higher when outliers are classified correctly
  3. pass the customized classifier and the customized score to AutoSklearnClassifier
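The first two steps could be sketched with scikit-learn alone as follows. The class name `OneClassSVMAsClassifier`, the function `outlier_score`, the choice of F1 on the outlier class as the "higher score", and the toy data are all hypothetical. Registering the classifier and the score with auto-sklearn (step 3) is not shown and would need auto-sklearn's own extension mechanism.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.metrics import f1_score, make_scorer
from sklearn.svm import OneClassSVM


class OneClassSVMAsClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: predicts 1 for outliers and 0 for inliers."""

    def __init__(self, nu=0.1, gamma="scale"):
        self.nu = nu
        self.gamma = gamma

    def fit(self, X, y=None):
        # The pseudo labels are ignored here; the underlying model is unsupervised.
        self.model_ = OneClassSVM(nu=self.nu, gamma=self.gamma).fit(X)
        self.classes_ = np.array([0, 1])
        return self

    def predict(self, X):
        # OneClassSVM returns -1 for outliers; map that to class 1.
        return (self.model_.predict(X) == -1).astype(int)


def outlier_score(y_true, y_pred):
    """Return 1e-5 if no outlier labels are present, else F1 on the outlier class."""
    y_true = np.asarray(y_true)
    if not np.any(y_true == 1):
        return 1e-5
    return f1_score(y_true, y_pred, pos_label=1)


scorer = make_scorer(outlier_score)

rng = np.random.RandomState(0)
X_train = rng.normal(size=(200, 2))  # assumed to contain inliers only
X_val = np.vstack([rng.normal(size=(50, 2)), rng.uniform(8, 10, size=(5, 2))])
pseudo_y_val = np.array([0] * 50 + [1] * 5)  # 1 marks the injected outliers

clf = OneClassSVMAsClassifier(nu=0.1).fit(X_train)
score = scorer(clf, X_val, pseudo_y_val)
print(score)
```

One caveat with this design: the score can only be computed when some pseudo outlier labels exist, so the constant 1e-5 gives the optimizer no signal on folds that contain a single class.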

Does it sound reasonable and workable?

Any comments are highly appreciated.

JM

jmren168 avatar Feb 24 '23 06:02 jmren168