period problem with AnomalyDetectionTs

Open blatoo opened this issue 8 years ago • 11 comments

Hi everybody,

After successfully running the example, I created my own data set, myData, which has the same structure as raw_data. There are two differences, however:

  • It contains missing values in the second column (raw_data has none).
  • The timestamps cover just one day, at 15-second intervals (raw_data covers 5 days at one-minute intervals).

It looks like:

1 1970-01-01 01:00:55 NA
2 1970-01-01 01:00:10 NA
3 1970-01-01 01:00:25 2.871
4 1970-01-01 01:00:40 2.654
5 1970-01-01 01:00:55 3.060
6 1970-01-01 01:00:10 9.074

After running the same command as in the example:

res = AnomalyDetectionTs(myData, max_anoms=0.02, direction='both', plot=TRUE)

I got the error message:

Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, : must supply period length for time series decomposition

How can I fix this problem?

If I don't know the period, can I still find the anomalies?

Thanks very much for the great work!

Best Regards

Conny

blatoo avatar Jul 05 '15 09:07 blatoo

Hi Conny,

I would suggest trying the AnomalyDetectionVec function instead of the TS function. At the moment, the TS function aggregates per-second data into per-minute data. The Vec function simply takes a list of values and treats them as a time series, without the timestamp column. A few things might help when using the Vec function:

  • We suggest replacing all non-leading NAs with interpolated values (see na.approx in the zoo package).
  • Make a best estimate of the period. If there isn't a strong seasonal component, then I might recommend simply removing the trend and applying general ESD to the residual.
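
The interpolation step can be sketched in base R as follows. This is a minimal illustration, not the package's own code: `fill_interior_na` is a hypothetical helper that uses `approx` as a stand-in for `zoo::na.approx`, and the sample values are made up to resemble the data in the question. Unlike `na.approx`, this sketch carries the last observed value forward over trailing NAs.

```r
# Hypothetical helper (not part of AnomalyDetection or zoo):
# drop leading NAs, then linearly interpolate the remaining ones.
fill_interior_na <- function(x) {
  ok <- which(!is.na(x))
  if (length(ok) < 2) stop("need at least two non-missing values")
  x <- x[ok[1]:length(x)]            # drop leading NAs
  idx <- seq_along(x)
  known <- !is.na(x)
  # rule = 2 repeats the last observed value over any trailing NAs
  approx(idx[known], x[known], xout = idx, rule = 2)$y
}

# Made-up values: two leading NAs and one interior NA
values <- c(NA, NA, 2.871, 2.654, NA, 3.060, 9.074)
filled <- fill_interior_na(values)
print(filled)
```

The resulting vector has no NAs and could then be passed to AnomalyDetectionVec together with an explicit period.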

Hope that helps.

owenvallis avatar Jul 05 '15 20:07 owenvallis

Hi Owenvallis,

thanks very much for the answer! But I still have another stupid question, what is ESD?

blatoo avatar Jul 06 '15 09:07 blatoo

@blatoo ESD stands for Extreme Studentized Deviate. The primary algorithm of this package is Seasonal Hybrid ESD (S-H-ESD), which builds on the generalized ESD test.
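
For reference, the generalized ESD (Extreme Studentized Deviate) test that S-H-ESD builds on can be sketched in base R as below. This is an illustrative implementation of the textbook test using the mean and standard deviation, not the package's internal `detect_anoms` code; the function name `generalized_esd` and the sample data are assumptions made for this example.

```r
# Hypothetical sketch of the generalized ESD test (textbook form).
# Returns the indices of x flagged as outliers, at most max_outliers of them.
generalized_esd <- function(x, max_outliers, alpha = 0.05) {
  n <- length(x)
  stopifnot(max_outliers >= 1, max_outliers <= n - 2)
  remaining <- seq_len(n)     # indices still in the sample
  removed <- integer(0)       # indices removed so far, in order
  n_out <- 0                  # largest i whose statistic exceeds its critical value
  for (i in seq_len(max_outliers)) {
    vals <- x[remaining]
    dev <- abs(vals - mean(vals))
    s <- sd(vals)
    if (s == 0) break         # all remaining values identical
    j <- which.max(dev)
    r_i <- dev[j] / s         # test statistic R_i
    removed <- c(removed, remaining[j])
    remaining <- remaining[-j]
    # critical value lambda_i from the t distribution
    p <- 1 - alpha / (2 * (n - i + 1))
    t_crit <- qt(p, df = n - i - 1)
    lambda_i <- (n - i) * t_crit /
      sqrt((n - i - 1 + t_crit^2) * (n - i + 1))
    if (r_i > lambda_i) n_out <- i
  }
  removed[seq_len(n_out)]
}

# Made-up residuals: small noise plus one large spike at position 31
residuals <- c(rep(c(-0.1, 0, 0.1), 10), 8)
outlier_idx <- generalized_esd(residuals, max_outliers = 3)
print(outlier_idx)
```

When there is no strong seasonal component, the same test could be applied to a detrended series, as suggested earlier in the thread.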

terrytangyuan avatar Jul 31 '15 19:07 terrytangyuan

Hi @terrytangyuan , Thanks very much!!!

blatoo avatar Aug 20 '15 18:08 blatoo

Could anyone close this? Thanks.

terrytangyuan avatar Feb 13 '16 19:02 terrytangyuan

Hey, I also ran into the same issue, even though my input time series has a regular interval. Note that my data contains no NA values. It looks like this issue is still open.

tintojames avatar Feb 23 '16 03:02 tintojames

Hello, I am experiencing a similar error with 1 Hz data. Have there been any developments on this issue since February?

`> str(data)
'data.frame': 3600 obs. of 2 variables:
 $ V1: POSIXct, format: "2016-10-29 07:00:00" "2016-10-29 07:00:01" "2016-10-29 07:00:02" ...
 $ V2: num 28.7 28.7 28.7 28.7 28.7 ...
> head(data)
                   V1    V2
1 2016-10-29 07:00:00 28.69
2 2016-10-29 07:00:01 28.69
3 2016-10-29 07:00:02 28.70
4 2016-10-29 07:00:03 28.70
5 2016-10-29 07:00:04 28.70
6 2016-10-29 07:00:05 28.71
> data_anomaly = AnomalyDetectionTs(data, max_anoms=0.02, direction="pos", plot=TRUE, e_value = T)
Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, :
  must supply period length for time series decomposition`

evanhenry avatar Oct 29 '16 20:10 evanhenry

This worked for me.

res = AnomalyDetectionVec(group_prof_10252016[,2], max_anoms=0.02, period=1440, direction='both', only_last=FALSE, plot=TRUE)

On Sat, Oct 29, 2016 at 3:38 PM, Evan Henry wrote:

I also noticed that changing the time period in the sample data and code here results in the same error: https://github.com/pablo14/anomaly_detection_post


jj7353 avatar Nov 01 '16 14:11 jj7353

Hi all, I am quite new to this package and would like to use it for some analysis I am doing. My data is not regular (it is trading data). Would I be able to use AnomalyDetection to identify, say, irregular prices charged? If so, what should I set "period" to, given that on some days there might be a trade every second or every hour, and on some days none at all? I have data for roughly a year.

Any help will be greatly appreciated! Thanks!

aaishaosman avatar May 23 '17 11:05 aaishaosman

Hi @jj7353, I would like to know more about the period parameter: why did you choose period = 1440, and how should this parameter be chosen correctly?

thx.

liuguiyangnwpu avatar Jul 09 '18 11:07 liuguiyangnwpu

In the raw_data example, the data is sampled every minute, so the period is 1440: 24 hours * 60 minutes = 1440 observations per day.
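
The same arithmetic extends to other sampling intervals. The sketch below assumes evenly spaced data and a daily seasonal cycle; `obs_per_cycle` is a hypothetical helper written for this illustration, not a function from the package.

```r
# Hypothetical helper: number of observations in one seasonal cycle,
# for evenly spaced data. Defaults to a daily cycle (86400 seconds).
obs_per_cycle <- function(interval_secs, cycle_secs = 24 * 60 * 60) {
  stopifnot(interval_secs > 0, cycle_secs %% interval_secs == 0)
  cycle_secs / interval_secs
}

period_minute <- obs_per_cycle(60)  # one-minute data, as in raw_data
period_15s    <- obs_per_cycle(15)  # 15-second data
period_1hz    <- obs_per_cycle(1)   # 1 Hz data

print(c(period_minute, period_15s, period_1hz))
```

Note that seasonal decomposition generally needs at least two full cycles of data, so a daily period is probably not usable with a single day or a single hour of observations. In that case a shorter cycle would have to be assumed, or the detrend-and-ESD approach suggested earlier in the thread may be a better fit.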

Maryoda2 avatar Jul 10 '18 08:07 Maryoda2