
Exclude outliers from training in volume tests

Open bebbo203 opened this issue 1 year ago • 1 comments

Is your feature request related to a problem? Please describe. When performing volume tests, a sudden spike (or drop) in a single bucket can completely skew the training, increasing the likelihood of false negatives on later buckets.

Describe the solution you'd like Points that have triggered a failure should be excluded from the training set.
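To illustrate the request, here is a minimal Python sketch of the idea. It is not Elementary's implementation: the function name `detect_anomalies`, the z-score threshold of 3, and the sample values are all assumptions made up for this example. Every point is still returned (so it could still be visualized), but points flagged as anomalous are kept out of the training set when `exclude_failures` is enabled.

```python
from statistics import mean, stdev


def detect_anomalies(values, threshold=3.0, min_training=3, exclude_failures=True):
    """Hypothetical helper: return (value, is_anomaly) for every point, in order.

    Each point is scored against the points seen before it. When
    exclude_failures is True, flagged points are not added to the
    training set, so they cannot distort the baseline for later points.
    """
    training = []
    results = []
    for value in values:
        is_anomaly = False
        if len(training) >= min_training:
            mu = mean(training)
            sigma = stdev(training)
            if sigma > 0:
                is_anomaly = abs(value - mu) / sigma > threshold
        # The point is always reported, even if it is excluded from training.
        results.append((value, is_anomaly))
        if not (is_anomaly and exclude_failures):
            training.append(value)
    return results


# Made-up row counts per bucket, with two spikes.
values = [100, 102, 98, 101, 99, 1000, 100, 1000]

with_exclusion = [flag for _, flag in detect_anomalies(values, exclude_failures=True)]
without_exclusion = [flag for _, flag in detect_anomalies(values, exclude_failures=False)]
```

With these sample values, both spikes are flagged when failures are excluded from training. When the first spike is kept in the training set, it inflates the mean and standard deviation enough that the second spike is no longer flagged, which is the false negative described above.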

Describe alternatives you've considered By setting anomaly_exclude_metrics appropriately, it is possible to achieve the same result, but the excluded point will no longer be visualized in the dashboard. Example: anomaly_exclude_metrics: ((metric_value - AVG(metric_value) OVER (partition by metric_name, full_table_name, column_name, dimension, dimension_value order by bucket_end asc rows between unbounded preceding and current row)) / STDDEV(metric_value) OVER (partition by metric_name, full_table_name, column_name, dimension, dimension_value order by bucket_end asc rows between unbounded preceding and current row)) >= 2

Additional context image

The top graph is the one with the modified anomaly_exclude_metrics. Ideally it should visualize all the data points and exclude only the Jul 2 point from the training.

Would you be willing to contribute this feature? Currently I don't have much time, but it could be possible in the future.

bebbo203 avatar Jul 04 '24 06:07 bebbo203

Hi @bebbo203, thanks for opening this issue. This totally makes sense, and it's definitely something we're considering. anomaly_exclude_metrics is a workaround, but definitely not the ideal one.

haritamar avatar Oct 01 '24 11:10 haritamar