
Specific anomaly tuning for columns

Open suelai opened this issue 1 year ago • 0 comments

What

Add flexibility to define anomalies at the column level. Three new features:

  • define anomaly_detector on the column level, which overrides the model anomaly_detector
  • have absolute thresholds for the values
  • control the alerts with the change_percentage

Resolves https://github.com/re-data/re-data/issues/413

How

I am adding a new model, re_data_selected_columns, that holds the config for each column. When a column config is present, re_data_anomalies reads from re_data_selected_columns instead of re_data_selected. Example of a column anomaly config:

      - name: test_data
        config:
          re_data_monitored: true
          re_data_time_filter: random_date
          re_data_anomaly_detector:
            name: z_score
            threshold: 3
          re_data_metrics_groups:
            - table_metrics
          re_data_metrics:
            column:
              price:
                - avg:
                    re_data_anomaly_detector:
                      name: z_score
                      threshold: 0.1
                      direction: both
                    absolute_threshold:
                      threshold: 5200
                      direction: up
                    change_percentage:
                      threshold: 5
                      direction: up

              customer_id:
                - distinct_values
                - avg

In the case above, avg(price) uses a strict z_score threshold of 0.1, which overrides the value of 3 set on the model. In addition, the value must be higher than 5200 and the percentage change must be higher than 5% for it to count as an anomaly.
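To make the combined rule easier to follow, here is a rough Python sketch of how the three checks could be evaluated together for one metric value. It is not the SQL in this PR. The function names (`deviates`, `crosses_absolute`, `is_anomaly`), the argument layout, and the way `direction` is interpreted are assumptions for illustration; only the config keys are taken from the example above.

```python
def deviates(signed_value, threshold, direction="both"):
    """Check a signed deviation (z-score or % change) against a threshold.

    Assumed semantics: "up" flags increases, "down" flags decreases,
    "both" flags either.
    """
    if direction == "up":
        return signed_value > threshold
    if direction == "down":
        return signed_value < -threshold
    return abs(signed_value) > threshold


def crosses_absolute(value, threshold, direction):
    """Check a raw metric value against an absolute threshold (assumed semantics)."""
    if direction == "up":
        return value > threshold
    if direction == "down":
        return value < threshold
    raise ValueError(f"unsupported direction for absolute threshold: {direction}")


def is_anomaly(value, mean, stddev, previous, model_detector, metric_config):
    """Return True only when every configured check flags the value."""
    # A column-level detector, when present, overrides the model-level one.
    detector = metric_config.get("re_data_anomaly_detector", model_detector)
    if stddev == 0:
        return False  # z-score is undefined without variation
    z_score = (value - mean) / stddev
    if not deviates(z_score, detector["threshold"], detector.get("direction", "both")):
        return False

    absolute = metric_config.get("absolute_threshold")
    if absolute and not crosses_absolute(value, absolute["threshold"], absolute["direction"]):
        return False

    change = metric_config.get("change_percentage")
    if change:
        if previous == 0:
            return False  # percentage change is undefined from a zero baseline
        pct = (value - previous) / previous * 100
        if not deviates(pct, change["threshold"], change.get("direction", "both")):
            return False
    return True


# Config mirroring the avg(price) example above.
MODEL_DETECTOR = {"name": "z_score", "threshold": 3}
AVG_PRICE_CONFIG = {
    "re_data_anomaly_detector": {"name": "z_score", "threshold": 0.1, "direction": "both"},
    "absolute_threshold": {"threshold": 5200, "direction": "up"},
    "change_percentage": {"threshold": 5, "direction": "up"},
}
```

With these assumptions, a value of 5500 against a mean of 5400, a standard deviation of 100, and a previous value of 5000 passes all three checks. The same value with no column config is not flagged, because its z-score of 1.0 stays below the model threshold of 3.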

suelai, Dec 27 '23 15:12