Exclude outliers from training in volume tests
Is your feature request related to a problem? Please describe.
When running volume tests, a sudden spike (or drop) in a single bucket can completely skew the training set, which increases the likelihood of false negatives.
Describe the solution you'd like
Points that have triggered a failure should be excluded from the training set.
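To make the request concrete, here is a minimal Python sketch of the behaviour I have in mind. It is not the tool's actual implementation: the function name `score_buckets`, the `threshold` and `min_training` parameters, and the plain z-score rule are all assumptions made for illustration. Every bucket is still scored and returned (so it can be plotted), but a bucket that fails is not added to the training set used for later buckets.

```python
from statistics import mean, stdev


def score_buckets(values, threshold=3.0, min_training=3, exclude_failures=True):
    """Score each bucket against the buckets seen before it.

    Hypothetical helper: returns one (value, z_score, is_anomaly) tuple per
    bucket, so every point stays available for visualization.
    """
    training = []  # buckets used to compute the expected mean and stddev
    results = []
    for value in values:
        z_score = None
        is_anomaly = False
        if len(training) >= min_training:
            sigma = stdev(training)  # sample standard deviation
            if sigma > 0:
                z_score = (value - mean(training)) / sigma
                is_anomaly = abs(z_score) >= threshold
        results.append((value, z_score, is_anomaly))
        # The requested change: a failing bucket never enters the training set.
        if not (is_anomaly and exclude_failures):
            training.append(value)
    return results


row_counts = [100, 102, 98, 101, 99, 500, 100, 103, 150]
with_exclusion = [flag for _, _, flag in score_buckets(row_counts)]
without_exclusion = [
    flag for _, _, flag in score_buckets(row_counts, exclude_failures=False)
]
```

With these made-up row counts, the spike at 500 fails in both modes. When it is kept in the training set, it inflates the mean and standard deviation so much that the later value of 150 passes; when it is excluded, 150 is flagged as well.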
Describe alternatives you've considered
By setting anomaly_exclude_metrics appropriately, it is possible to achieve the same result, but the excluded point is then no longer shown in the dashboard.
Example:
anomaly_exclude_metrics: ((metric_value
    - AVG(metric_value) OVER (partition by metric_name, full_table_name, column_name, dimension, dimension_value order by bucket_end asc rows between unbounded preceding and current row))
    / STDDEV(metric_value) OVER (partition by metric_name, full_table_name, column_name, dimension, dimension_value order by bucket_end asc rows between unbounded preceding and current row)) >= 2
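For readers who want to see what this expression evaluates to, here is a rough Python model of it for a single partition, already ordered by bucket_end. The function name `excluded_by_expression` is hypothetical, and the model assumes that STDDEV is the sample standard deviation and that a NULL or zero standard deviation means the row is not excluded; the real behaviour depends on the warehouse.

```python
from statistics import mean, stdev


def excluded_by_expression(values, cutoff=2.0):
    """Return, per bucket, whether the expression above would exclude it."""
    flags = []
    for i, value in enumerate(values):
        # "rows between unbounded preceding and current row": the window
        # includes the current bucket itself.
        window = values[: i + 1]
        if len(window) < 2:
            flags.append(False)  # assumption: stddev of one row is NULL
            continue
        sigma = stdev(window)
        if sigma == 0:
            flags.append(False)  # assumption: zero stddev never excludes
            continue
        flags.append((value - mean(window)) / sigma >= cutoff)
    return flags


spike = [100, 102, 98, 101, 99, 100, 101, 99, 500, 100]
drop = [100, 102, 98, 101, 99, 100, 101, 99, 0, 100]
spike_flags = excluded_by_expression(spike)
drop_flags = excluded_by_expression(drop)
```

Two things stand out in this model. The window contains the current row, so a spike inflates its own mean and standard deviation and needs several earlier buckets before it can reach a score of 2. And because the comparison is `>= 2` without an absolute value, only spikes are excluded: the drop to 0 in the second series is not.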
Additional context
The top graph is the one with the modified anomaly_exclude_metrics. Ideally, it would still show all the data points and exclude only the Jul 2 point from the training set.
Would you be willing to contribute this feature?
I don't have much time at the moment, but it could be possible in the future.
Hi @bebbo203, thanks for opening this issue.
Totally makes sense and it's definitely something we're considering. anomaly_exclude_metrics is a workaround but definitely not the ideal one.