Partition by dimension in metrics_anomaly_score

Open • nescobar opened this issue 1 year ago • 1 comment

Describe the bug

In metrics_anomaly_score.sql, metric_value is not partitioned by dimension when the dimension properties are used. This skews the anomaly score, because the training average it is derived from is computed over the metric values of ALL dimensions combined.

To Reproduce

In the code below, the window that computes training_avg is not partitioned by dimension:

avg(metric_value) over (partition by metric_name, full_table_name, column_name order by bucket_start asc rows between unbounded preceding and current row) as training_avg

Expected behavior

When dimensions are used, the average metric_value should also be partitioned by dimension_value:

avg(metric_value) over (partition by metric_name, full_table_name, column_name, dimension_value order by bucket_start asc rows between unbounded preceding and current row) as training_avg
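To make the difference concrete, here is a minimal sketch that runs both window definitions against an in-memory SQLite database. The table name `metrics`, the sample rows, and the helper `training_averages` are hypothetical and only stand in for the real model; the two partition clauses are the ones quoted above.

```python
import sqlite3

# Window expression from the issue, with the partition columns left as a parameter.
WINDOW_QUERY = """
    select
        dimension_value,
        bucket_start,
        metric_value,
        avg(metric_value) over (
            partition by {partition_cols}
            order by bucket_start asc
            rows between unbounded preceding and current row
        ) as training_avg
    from metrics
"""


def training_averages(conn, partition_cols):
    """Return (dimension_value, bucket_start, metric_value, training_avg) rows."""
    return conn.execute(WINDOW_QUERY.format(partition_cols=partition_cols)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table metrics ("
    "metric_name text, full_table_name text, column_name text, "
    "dimension text, dimension_value text, bucket_start integer, metric_value real)"
)

# Hypothetical data: one metric, two dimension values at very different levels.
rows = []
for bucket in (1, 2, 3):
    rows.append(("row_count", "db.schema.orders", "none", "country", "US", bucket, 100.0))
    rows.append(("row_count", "db.schema.orders", "none", "country", "FR", bucket, 10.0))
conn.executemany("insert into metrics values (?, ?, ?, ?, ?, ?, ?)", rows)

# Current behaviour: US and FR rows share one training window.
current = training_averages(conn, "metric_name, full_table_name, column_name")

# Proposed behaviour: each dimension value gets its own training window.
proposed = training_averages(
    conn, "metric_name, full_table_name, column_name, dimension_value"
)

for label, result in (("current", current), ("proposed", proposed)):
    print(label)
    for row in result:
        print("  ", row)
```

With the current partition the running average drifts toward the mean of both series, so neither the US nor the FR values are compared against their own history. With dimension_value in the partition, each series is averaged only against itself.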

Oct 24 '24 00:10 nescobar

We are encountering the same thing here. Just to add to what @nescobar mentioned, in cases where there may be multiple dimensions named the same thing, I think we would also want to include dimension in the partition:

avg(metric_value) over (partition by metric_name, full_table_name, column_name, dimension, dimension_value order by bucket_start asc rows between unbounded preceding and current row) as training_avg
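One way such a collision could show up, sketched under assumptions: two different dimensions (here the hypothetical `country` and `platform`) both produce the dimension_value `unknown`. The table name `metrics`, the sample rows, and the helper `training_averages` are made up for illustration and run against an in-memory SQLite database.

```python
import sqlite3

# Same window expression as in the issue, with the partition columns as a parameter.
WINDOW_QUERY = """
    select
        dimension,
        dimension_value,
        metric_value,
        avg(metric_value) over (
            partition by {partition_cols}
            order by bucket_start asc
            rows between unbounded preceding and current row
        ) as training_avg
    from metrics
"""


def training_averages(conn, partition_cols):
    """Return (dimension, dimension_value, metric_value, training_avg) rows."""
    return conn.execute(WINDOW_QUERY.format(partition_cols=partition_cols)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table metrics ("
    "metric_name text, full_table_name text, column_name text, "
    "dimension text, dimension_value text, bucket_start integer, metric_value real)"
)

# Hypothetical data: two dimensions that happen to share the value 'unknown'.
rows = []
for bucket in (1, 2):
    rows.append(("row_count", "db.schema.orders", "none", "country", "unknown", bucket, 100.0))
    rows.append(("row_count", "db.schema.orders", "none", "platform", "unknown", bucket, 10.0))
conn.executemany("insert into metrics values (?, ?, ?, ?, ?, ?, ?)", rows)

# Partitioning by dimension_value alone merges the two 'unknown' series.
by_value = training_averages(
    conn, "metric_name, full_table_name, column_name, dimension_value"
)

# Adding dimension keeps the two series in separate training windows.
by_name_and_value = training_averages(
    conn, "metric_name, full_table_name, column_name, dimension, dimension_value"
)

for label, result in (("dimension_value only", by_value),
                      ("dimension + dimension_value", by_name_and_value)):
    print(label)
    for row in result:
        print("  ", row)
```

In this sketch, partitioning by dimension_value alone still blends the two `unknown` series, while adding dimension gives each one its own training average.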

Jan 21 '25 16:01 wbarth11