dbt-re-data
Specific anomaly tuning for columns
What
Adds flexibility to define anomalies at the column level. Three new features:
- define anomaly_detector on the column level, which overrides the model anomaly_detector
- have absolute thresholds for the values
- control the alerts with the change_percentage
Resolves https://github.com/re-data/re-data/issues/413
How
This PR adds a new model, re_data_selected_columns, that holds the config for each column. When a column config is present, re_data_anomalies reads from re_data_selected_columns instead of re_data_selected.
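The override rule can be sketched in a few lines of Python. This is only an illustration, not the PR's implementation (which lives in dbt models): the function name resolve_detector and the dict shapes are hypothetical, and it assumes the column-level detector replaces the model-level one as a whole rather than being merged key by key.

```python
from typing import Optional


def resolve_detector(model_config: dict, metric_config: Optional[dict]) -> dict:
    """Pick the anomaly detector for one column metric.

    Hypothetical helper: a column-level re_data_anomaly_detector, when
    present, is assumed to replace the model-level one entirely.
    """
    if metric_config and "re_data_anomaly_detector" in metric_config:
        return metric_config["re_data_anomaly_detector"]
    return model_config["re_data_anomaly_detector"]


model_config = {"re_data_anomaly_detector": {"name": "z_score", "threshold": 3}}
price_avg = {"re_data_anomaly_detector": {"name": "z_score", "threshold": 0.1}}

# avg(price) has its own detector; a metric without one falls back to the model.
price_detector = resolve_detector(model_config, price_avg)
fallback_detector = resolve_detector(model_config, None)
```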
Example of column anomaly config:
- name: test_data
  config:
    re_data_monitored: true
    re_data_time_filter: random_date
    re_data_anomaly_detector:
      name: z_score
      threshold: 3
    re_data_metrics_groups:
      - table_metrics
    re_data_metrics:
      column:
        price:
          - avg:
              re_data_anomaly_detector:
                name: z_score
                threshold: 0.1
                direction: both
              absolute_threshold:
                threshold: 5200
                direction: up
              change_percentage:
                threshold: 5
                direction: up
        customer_id:
          - distinct_values
          - avg
In the case above, avg(price) uses a strict z_score threshold of 0.1, which overrides the model-level threshold of 3. In addition, the value must be higher than 5200 and the percentage change must be higher than 5% for it to be flagged as an anomaly.
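A minimal Python sketch of how the three conditions might combine, assuming all of them must hold for a value to count as an anomaly, as described above. The names Rule, exceeds, beyond_absolute and is_anomaly are hypothetical, and the handling of direction (in particular, "down" meaning a negative deviation or a value below the absolute threshold) is an assumption, not taken from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    threshold: float
    direction: str = "both"  # "up", "down", or "both"


def exceeds(signed_value: float, rule: Rule) -> bool:
    """Check a signed deviation (z-score or percentage change) against a rule."""
    if rule.direction == "up":
        return signed_value > rule.threshold
    if rule.direction == "down":
        return signed_value < -rule.threshold
    return abs(signed_value) > rule.threshold


def beyond_absolute(value: float, rule: Rule) -> bool:
    """Check the raw metric value against an absolute threshold."""
    if rule.direction == "up":
        return value > rule.threshold
    if rule.direction == "down":
        return value < rule.threshold
    # "both" has no obvious meaning for a single absolute bound (assumption).
    raise ValueError(f"unsupported direction: {rule.direction}")


def is_anomaly(value: float, z_score: float, change_pct: float,
               detector: Rule,
               absolute: Optional[Rule] = None,
               change: Optional[Rule] = None) -> bool:
    """Assumed semantics: every configured condition must be satisfied."""
    if not exceeds(z_score, detector):
        return False
    if absolute is not None and not beyond_absolute(value, absolute):
        return False
    if change is not None and not exceeds(change_pct, change):
        return False
    return True


# Rules mirroring the avg(price) config from the example.
detector = Rule(threshold=0.1, direction="both")
absolute = Rule(threshold=5200, direction="up")
change = Rule(threshold=5, direction="up")

flagged = is_anomaly(5300, 0.4, 6.0, detector, absolute, change)
below_absolute = is_anomaly(5100, 0.4, 6.0, detector, absolute, change)
```

Here flagged is True because all three conditions hold, while below_absolute is False because 5100 does not exceed the absolute threshold of 5200.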