Feature Request: Support day_of_month Seasonality in anomaly tests

Open oliviamaquar opened this issue 1 month ago • 2 comments

Is your feature request related to a problem? Please describe.

Some datasets I monitor have strong month-based patterns (e.g., billing cycles, contract renewals, month-end processing). Using the existing seasonality options (hour_of_day, day_of_week, hour_of_week) causes recurring monthly spikes and dips to be flagged as anomalies, even though they are expected behavior. This creates noise and reduces the usefulness of anomaly detection for these tables.

Why this matters

  • Many business processes follow a monthly cadence
  • Monthly spikes or dips are normal and should not be flagged as anomalies
  • Current seasonality settings do not capture these recurring patterns

Describe the solution you'd like

Add support for day_of_month as a valid seasonality option in freshness_anomalies and volume_anomalies so the model can learn and adjust to recurring monthly patterns. This would allow expected variations on the 1st, 15th, or end of month to be treated as normal rather than flagged.

Proposed enhancement

Add day_of_month as an accepted value for the seasonality parameter, enabling models to learn patterns such as:

  • end-of-month transaction volumes
  • 1st-of-month ingestion bursts
  • mid-month reporting cycles

Example desired configuration

tests:
  - elementary.volume_anomalies:
      timestamp_column: created_at
      seasonality: day_of_month
  - elementary.freshness_anomalies:
      timestamp_column: created_at
      seasonality: day_of_month

Describe alternatives you've considered

  • Using day_of_week seasonality, which doesn’t capture monthly patterns
  • Suppressing known monthly outliers manually, which isn’t scalable across multiple models

Would you be willing to contribute this feature?

Yes, I’d be happy to collaborate and can provide examples, use cases, and testing support.

oliviamaquar, Nov 14 '25 23:11

Hey @oliviamaquar! Thank you for this. We had some thoughts about this before, and it created a few edge cases around the different number of days per month (and it would require a longer training period). Let us revisit this - the need here is clear 🙂
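To make the two concerns in this comment concrete (uneven month lengths and the need for a longer training period), here is a minimal Python sketch. It is only an illustration using the standard library, not anything from Elementary itself, and the helper name `day_of_month_counts` is made up for this example. It counts how many samples each day-of-month bucket would collect over one calendar year.

```python
import calendar
from collections import Counter


def day_of_month_counts(year: int) -> Counter:
    """Count how many times each day-of-month number occurs in `year`.

    Hypothetical helper for illustration only; not part of Elementary.
    """
    counts = Counter()
    for month in range(1, 13):
        # monthrange returns (weekday of the 1st, number of days in the month)
        _, days_in_month = calendar.monthrange(year, month)
        counts.update(range(1, days_in_month + 1))
    return counts


if __name__ == "__main__":
    counts = day_of_month_counts(2025)
    for day in (1, 28, 29, 30, 31):
        print(f"day {day}: {counts[day]} samples in 2025")
```

Each day-of-month bucket gets at most one sample per month, and days 29 to 31 get fewer than that, so a training window sized for day_of_week seasonality would leave most buckets with very little history.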

NoyaOffer, Nov 16 '25 08:11

Hey @oliviamaquar! I had another thought about this for the meantime. We can leverage dimension anomalies, where every dimension is a day of the month. Something like this:

models:
  - name: your_model_name
    config:
      elementary:
        timestamp_column: updated_at
    data_tests:
      - elementary.dimension_anomalies:
          arguments:
            dimensions:
              - "EXTRACT(day FROM updated_at)"
            timestamp_column: updated_at

It would give visibility per dimension and search for anomalies there. The current downside is that the visualization would be per dimension and you won't be able to see the full graph.
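As a rough illustration of what "every dimension is the day of month" means, the sketch below groups daily row counts by day of month and scores the latest value in each group against earlier values from the same day. This is only a conceptual sketch with a plain z-score; it is not Elementary's actual detection logic, and the function name and input shape are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, stdev


def day_of_month_zscores(daily_counts: dict[date, int]) -> dict[int, float]:
    """Score the latest count of each day-of-month against its own history.

    Hypothetical helper: a plain z-score per day-of-month group, used here
    only to illustrate the per-dimension idea.
    """
    by_day = defaultdict(list)
    for d in sorted(daily_counts):  # chronological order
        by_day[d.day].append(daily_counts[d])

    scores = {}
    for day, values in by_day.items():
        history, latest = values[:-1], values[-1]
        if len(history) < 2:
            continue  # too few samples to estimate a spread
        spread = stdev(history)
        if spread == 0:
            continue  # constant history, z-score undefined
        scores[day] = (latest - mean(history)) / spread
    return scores
```

Because each day of the month is compared only with itself, a regular 1st-of-month burst does not stand out, while an unusual value on an otherwise quiet day does. It also shows the downside mentioned above: each group is evaluated, and would be charted, separately.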

NoyaOffer, Nov 19 '25 16:11