Anomaly detection on rainfall data

Open infinitemind2 opened this issue 3 years ago • 2 comments

Not exactly a feature request, more of an 'is it possible?' question. If it is, any ideas on how to approach the implementation?

I have data for various locations, with records of differing lengths. Each location has its own seasonality and intensity (very different distributions). I am updating/inserting records and want to highlight possible anomalies. The anomalies could be caused by errors in the recording process (calibration data, typing errors, etc.) and may need further investigation.

I have had a look at DetectAnomalyBySrCnn and am not sure I implemented it correctly. Once I Fit the data, create the TimeSeriesEngine, and call Predict, I get zero values in the resulting vector. If I run the training data through Predict, I get results that look promising. However, the data appears to be added to the engine and skews further tests: after I've used known bad data, Predict gives a different result the next time.

Am I barking up the correct tree? If so, any suggestions on the direction I should take?

I know this is a big ask, thanks for any help.

infinitemind2 • Jan 20 '22 00:01

@luisquintanilla or @JakeRadMSFT, is either of you familiar enough with anomaly detection to help answer this question?

@infinitemind2 By default, the data is added to the engine's state but not to the actual model (unless you explicitly make the call to update the model). So if you don't want new inputs to skew later results, you will need to recreate the prediction engine.
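To make the state behaviour concrete, here is a minimal sketch in Python rather than ML.NET. It is a toy stand-in, not SR-CNN: `RollingDetector`, `make_engine`, the window size, the threshold, and the rainfall numbers are all hypothetical and chosen only for illustration. It shows a detector whose state absorbs every value it scores, and how building a fresh one per check avoids the skew.

```python
from collections import deque
from statistics import mean, pstdev


class RollingDetector:
    """Toy stateful detector: flags a value far from the recent window.

    Hypothetical stand-in for a stateful prediction engine.
    It is not ML.NET and it does not implement SR-CNN.
    """

    def __init__(self, history, window=8, threshold=3.0):
        # "Fitted" state: the last `window` training values.
        self.window = deque(history[-window:], maxlen=window)
        self.threshold = threshold

    def predict(self, value):
        mu = mean(self.window)
        sigma = pstdev(self.window) or 1e-9  # guard against a flat window
        is_anomaly = abs(value - mu) / sigma > self.threshold
        # Every scored value joins the state, so it affects later calls.
        self.window.append(value)
        return is_anomaly


def make_engine(history):
    """Build a fresh detector so earlier test values cannot leak into later ones."""
    return RollingDetector(history)


# Made-up daily rainfall readings for one location (illustrative only).
training = [2.0, 2.5, 1.8, 2.2, 2.4, 1.9, 2.1, 2.3]

# Reusing one engine: the bad reading shifts the baseline for the next call.
shared = make_engine(training)
shared_first = shared.predict(250.0)   # known-bad reading
shared_second = shared.predict(4.0)    # scored against a window that now holds 250.0

# Recreating the engine: each check sees only the training history.
fresh_first = make_engine(training).predict(250.0)
fresh_second = make_engine(training).predict(4.0)

print(shared_first, shared_second)  # True False
print(fresh_first, fresh_second)    # True True
```

With the shared detector, the 4.0 reading is no longer flagged because the earlier 250.0 has inflated the window's mean and spread; with a fresh detector per check it is flagged again. In ML.NET, the analogous step would presumably be calling CreateTimeSeriesEngine again on the fitted model before each independent test, though that mapping is an assumption here and worth verifying against your own pipeline.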

Are you having any specific issues with this? Or mostly just looking for some advice on it?

michaelgsharp • Jan 27 '22 19:01

Mainly looking for advice/guidance.

infinitemind2 • Jan 31 '22 04:01