pyod
Specifying categorical features in Python Outlier Detection (PyOD)
How do I specify the categorical features in PyOD when using Histogram-based Outlier Detection (HBOS) for anomaly detection? I've read that HBOS can be used for anomaly detection when categorical features are involved. I found its Python implementation here: https://pyod.readthedocs.io/en/latest/pyod.models.html#module-pyod.models.hbos But I can't figure out how to pass the positions or names of the categorical features in my dataset when training the model. The code I've tried:
from pyod.models.hbos import HBOS

# train_df is the training DataFrame
clf = HBOS(n_bins=10, alpha=0.1, tol=0.5, contamination=0.1)
clf.fit(train_df)
train_pred = clf.labels_  # 0 = inlier, 1 = outlier
HBOS has no parameter for specifying categorical features during training.
Hi there, sorry for the late response. Unfortunately, this functionality has not been implemented. One temporary workaround is to convert your categorical features into numerical ones (not a good idea, though). I will update you once this functionality is in place.
@yzhao062 Thanks. What do you suggest then: label encoding or one-hot encoding of the categorical features?
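To make the two options in this question concrete, here is a minimal sketch of both encodings with pandas. The toy data, the column names, and the `fit_hbos` helper are assumptions made for illustration; they are not part of PyOD. Neither encoding is a true categorical treatment, it only turns the columns into numbers that `HBOS.fit` will accept.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
train_df = pd.DataFrame({
    "amount": [10.0, 12.5, 11.0, 250.0, 9.5, 10.5],
    "channel": ["web", "web", "store", "phone", "web", "store"],
})
cat_cols = ["channel"]

# Option 1: label (integer) encoding. Compact, but the integer codes imply an
# order and a distance between categories that do not really exist.
label_encoded = train_df.copy()
for col in cat_cols:
    label_encoded[col] = label_encoded[col].astype("category").cat.codes

# Option 2: one-hot encoding. One 0/1 column per category and no artificial
# order, at the cost of more columns.
one_hot_encoded = pd.get_dummies(train_df, columns=cat_cols, dtype=float)


def fit_hbos(features):
    """Hypothetical helper: fit HBOS on an all-numeric frame, return 0/1 labels."""
    from pyod.models.hbos import HBOS  # imported lazily; requires pyod

    clf = HBOS(n_bins=10, alpha=0.1, tol=0.5, contamination=0.1)
    clf.fit(features.to_numpy(dtype=float))
    return clf.labels_
```

Calling `fit_hbos(one_hot_encoded)` or `fit_hbos(label_encoded)` then runs the same model as in the question on the encoded data. Which encoding works better depends on the dataset, so it is worth comparing both.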
@yzhao062 @shivasheeshyadav, my dataset contains categorical features. Can PyOD process categorical features now? What do you suggest doing with them?
Any update on this essential feature for categorical anomaly detection?
@Stevod do you have any ideas on how you will handle this? I've been using this library for the last 3 months and am now realizing the accuracy could be higher, since all the features I'm working with are categorical.
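For the all-categorical case, one stopgap is to score rows by category frequency, in the spirit of a histogram-based detector. The sketch below is hand-rolled with pandas and NumPy. It is not PyOD functionality and not how the library's HBOS is implemented; the function name and toy data are hypothetical, and it assumes there are no missing values.

```python
import numpy as np
import pandas as pd


def categorical_frequency_scores(df):
    """Sum over columns of log(1 / relative frequency); higher = more unusual."""
    scores = np.zeros(len(df))
    for col in df.columns:
        # Relative frequency of each row's category within its column.
        freq = df[col].map(df[col].value_counts(normalize=True))
        scores += np.log(1.0 / freq.to_numpy(dtype=float))
    return scores


# Hypothetical all-categorical toy data.
df = pd.DataFrame({
    "country": ["US", "US", "US", "US", "DE", "US"],
    "device": ["ios", "android", "ios", "ios", "tv", "android"],
})
scores = categorical_frequency_scores(df)
most_unusual = int(np.argmax(scores))  # index of the row with the rarest values
```

Like HBOS, this treats every feature independently, so it flags rows containing rare categories but will miss rows whose individual values are common and only the combination is unusual.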