PyOD for point-wise detection

Open ogreyesp opened this issue 4 years ago • 5 comments
Hi

I'm really interested in using TODS to detect outliers in multivariate time series data. However, I'm missing something. According to the official TODS documentation:

"Wide-range of Algorithms, including all of the point-wise detection algorithms supported by PyOD, state-of-the-art pattern-wise (collective) detection algorithms such as DeepLog, Telemanom, and also various ensemble algorithms for performing system-wise detection."

So TODS currently uses PyOD to perform point-wise detection on time series data. However, as indicated here, PyOD doesn't handle time series data. My question is: how does TODS adapt PyOD to perform point-wise detection on time series data correctly?

Best regards

ogreyesp avatar Jun 11 '21 10:06 ogreyesp

Hi @ogreyesp, PyOD is designed for point-cloud data, as I mentioned in the issue you linked. However, once a time series is converted into that format, PyOD is fully applicable. TODS provides the feature transformations needed to digest time series datasets, so PyOD's algorithms can be used on top of them.
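To illustrate the idea of turning a time series into a point cloud, here is a minimal sketch in plain Python. It is not TODS's actual transformation: the function name `point_features` and the three features chosen (raw value, change since the previous step, deviation from a trailing mean) are assumptions made only for this example.

```python
import statistics

def point_features(series, lag=1, window=5):
    """Turn each timestep into one feature vector (hypothetical feature set).

    Each row is (value, change since `lag` steps ago, deviation from the
    mean of the previous `window` values). The first `window` timesteps
    are skipped because they have no full trailing window.
    """
    rows = []
    for t in range(window, len(series)):
        trailing = series[t - window:t]
        rows.append((
            series[t],
            series[t] - series[t - lag],
            series[t] - statistics.fmean(trailing),
        ))
    return rows

# Hypothetical data: a flat signal with a single spike at t = 7.
series = [1.0] * 10
series[7] = 5.0

rows = point_features(series)
for t, row in enumerate(rows, start=5):
    print(t, row)
```

Each row is now an ordinary point in a 3-dimensional feature space, so a point-wise detector can score it without knowing that the data came from a time series. For instance, the rows could be passed to the `fit` method of a PyOD detector such as `pyod.models.knn.KNN`.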

Hope this helps.

yzhao062 avatar Jun 11 '21 16:06 yzhao062

Hi @yzhao062

Thanks for your response. So you are converting a time series into a cloud of points that are treated as statistically independent. This solution is acceptable, but I'm not sure it is 100% correct methodologically. When the i-th point is flagged as an outlier, the correlation between this point and its previous and next timesteps is ignored.

ogreyesp avatar Jun 12 '21 15:06 ogreyesp

Hi @ogreyesp, in TODS we have various primitives that help users extract appropriate features/contexts to address their own needs. For example, as you say, if we want to model the correlation between the current point and the next timesteps, we can use subsequence segmentation to extract the contextual information for each time point and construct the point cloud from that. Another way is to apply alternative algorithms that model temporal correlations within the data directly, such as autoregression.
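The subsequence segmentation idea can be sketched as follows. This is a simplified stand-in written in plain Python, not the TODS primitive itself: `sliding_windows` and `knn_scores` are hypothetical helpers, and the k-nearest-neighbour distance score is only one example of a point-wise detector that could be applied to the resulting windows.

```python
import math

def sliding_windows(series, width):
    """Return every contiguous subsequence of length `width` as one row."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]

def knn_scores(points, k=3):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical data: a sine wave (period 20) with one spike injected at t = 60.
series = [math.sin(2 * math.pi * t / 20) for t in range(120)]
series[60] += 3.0

width = 5
windows = sliding_windows(series, width)  # each window is a 5-dimensional point
scores = knn_scores(windows, k=3)

# The highest-scoring window should be one that overlaps the spike.
worst = max(range(len(scores)), key=scores.__getitem__)
print(f"most anomalous window covers t = {worst}..{worst + width - 1}")
```

Because each point in the cloud is now a whole window, the detector sees a timestep together with its neighbours, so the temporal context is no longer discarded. The window width controls how much context each point carries.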

lhenry15 avatar Jun 12 '21 15:06 lhenry15

Hi @lhenry15

Thanks! That's a very interesting point. Do you have an example of using the "Subsequence Segmentation" primitive with TODS?

ogreyesp avatar Jun 12 '21 15:06 ogreyesp

Hi @ogreyesp,

You might want to take a look at the benchmark branch, which has some examples of building detection pipelines with subsequence segmentation. We are still working on presenting more examples as IPython notebooks, and more will be coming later.

lhenry15 avatar Jun 14 '21 15:06 lhenry15