metricflow
metricflow copied to clipboard
[Feature] Support custom calendar
Context
Users want to be able to use a custom calendar, like a fiscal calendar or retail calendar, with Metricflow’s date/time operations. For example, if my company uses fiscal quarters I should be able to request a metric by fiscal quarter.
User Stories
- My company uses a fiscal calendar and I want to see revenue by quarter. When I query
mf query --metrics revenue --group-by metric_time__quarter
. I expect to see revenue broken out by fiscal quarter.- i.e
2024-02-01 - 2023-04-31
is Q1,2024-05-01 - 2024-07-31
is Q2,2024-08-01 = 2024-10-31
is q3,2024-11-01 - 2025-01-31
is Q4
- i.e
- My company uses a 4-5-4 retail calendar i.e the first month of the quarter has 4 weeks, the second has 5 weeks and the third has 4 weeks. When I query
mf query --metrics revenue --group-by metric_time__quarter
the 5-week second month is taken into account. - I want to see metrics using either a standard calendar or a custom calendar.
- I want my week to start on a day other than Monday
Proposed Solution
Our proposed solution is to have users create another date spine model, similar to the metricflow_time_spine
model which has the logic needed for their date transformation expressed in SQL. Alternatively, you can add columns to your current date spine model if you prefer instead of creating a new one.
Next, you will add a configuration file to metricflow that tells us the name of the dbt model that contains your custom calendar, and how the columns map to metricflow’s supported granularity. When you query this calendar we will join the date spine model to the query and use the column specified for each granularity.
The dbt_date package is a good example of what a fiscal calendar would look like, and we have an example table in our internal analytics project called all_dates
. This table format is a good example of the format we expect.
Calendar model requirements
There are a few requirements for the custom calendar model:
- It must be at a daily grain and joinable to other date columns on date_day.
- Columns must map to the data type we expect for a given granularity. For example,
quarter
must be a date.
Configuring your calendars in MetricFlow
Once you’ve created your customer calendar table, you need to tell MetricFlow where this calendar lives and map the columns to supported granularities. There are a few inputs we need from the user to use their custom calendar:
- The name of the custom calendar model.
- The columns in the table that represent custom granularities i.e fiscal quarter. This map back to internal MetricFlow granularities
This configuration in metric flow would look like the following. The full spec is outlined in the interface changes section.
semantic_layer_project_configs:
custom_calanders:
- name: all_dates# name of the dbt model
granularities:
fiscal_quarter: fiscal_quarter_date #column to map to
Calling a custom calendar through our APIs
Once the user has configured their custom calendar the can now call this calendar at query time as a function on metrics. For example:
JDBC query:
- Query
fiscal_calendar
granularity fromall_date
calendar
select
*
from
semantic_layer.query( metrics.calendar('all_dates')=['revenue'], group_by = ['metric_time__fiscal_quarter'])
---The above query compiles compiles to:
select
sum(revenue),
fiscal_quarter as metric_time__fiscal_quarter
from orders ord
left join all_days fc on ord.ordered_at = fc.date_day
- Querying
quarter
using theall_dates
calendar. This defaults to the standard behaviour because there is no value set forquarter
granularity i.edate_trunc(metric_time, 'quarter')
select
*
from
semantic_layer.query( metrics.calendar('all_dates')=['revenue'], group_by = ['metric_time__quarter'])
--The above query compiles to:
select
sum(revenue),
date_trunc(metric_time,'quarter') as metric_time__quarter
from orders ord
CLI query:
mf query --metrics revenue --calendar all_dates --group-by metric_time__fiscal_quarter
Limitations querying metrics with a custom calendar
- You can only choose one calendar for the entire query, not per metric
Why create a custom calendar model and join that model to metrics queries instead of doing the date transformations natively in metricflow?
Date transformations are complex, and organizations will have extremely custom needs. Expressing all the date permutations you might need to do in the spec in a generalized way becomes very difficult. The proposed approach allows you to express the date transformations you need in SQL giving you maximum flexibility. When you’ve finished defining your date transformations, you can expose them to the semantic layer, and query them in a governed way since the columns are mapped back to metricflow granularities. The potential risks called out below are missing date values and possible performance concerns. Both these risks are relatively minimal, and the tradeoff for more flexibility feels worth it.
Requirements:
There are a few requirements/updates needed in MetricFlow to support this.
- Metricflow can take multiple time spine models
- Update our granularity enums to include fiscal quarter and fiscal year. These granularities would require a custom calendar to be set.
- Expose project-level configs to users so they can tell us the name of their custom calendar model and map columns to granularities
- Update APIs to choose which calendar to use for metrics
Interface Changes
You need to do 2 things to set up a custom calendar:
- Create a time spine model in your dbt project that contains the date transformations you need. We can provide examples of common date transformations i.e fiscal quarter
- Tell Metricflow the name of that calendar and which columns map to which granularities
For #2 we have an internal representation of project-level configs where we point to the base time spine model. We can expose this to users and have them configure an additional calendar/time table.
Parameter | Description |
---|---|
semantic_layer_project_configs | Project-level configurations that apply to your entire semantic graph |
semantic_layer_project_configs:calendars | Configuration for a calendar model. This points to a dbt model in your project |
semantic_layer_project_configs:clendars:granularities | It makes it back to one of our granularity enums (currently day, week, month, quarter, year). We will extend this to include fiscal quarter and fiscal year. |
semantic_layer_project_configs:calendars:name | The name of the dbt model |
semantic_layer_project_configs:
calanders:
- name: #the name of the dbt model
granularities:
day: #column name for daily grain in your timespine model
week: #column name for daily grain in your timespine model
month: #column name for daily grain in your timespine model
year: #column name for daily grain in your timespine model
fiscal_quarter: #column name for daily grain in your timespine model
fiscal_year: #column name for daily grain in your timespine model
Risks:
- There may be null dates if not every date in the data set is represented in the custom calendar
- This is a vector for metric values to be different i.e you query a metric using a custom calendar and a regular calendar and get different values
- There is a performance hit requesting a granularity from your custom calendar since we have to do a table scan and join to the calendar model. This should be relatively small. We could maybe do some smart filtering on the calendar table to make to only join dates that are after the minimum time in the data set.
Just to add one more use case to this feature. We also have customers requesting for different start of week (first day of week), either Monday or Sunday when working with weekly granularity queries.
Adding another use case; we're very interested in this for supporting retail calendars / 454 calendars. Without this support we're completely unable to use MetricFlow for our use cases.
@gneius @snikch Just updated the issue with a more detailed explanation of what we're proposing. @gneius Supporting a different week start will be possible with the proposed implementation.
@snikch can you say a bit more about how you envision retail calendars working in metricflow?
Thanks @Jstein77. I'm going to "summon" my colleague @nzben into the convo as he's been working more closely on this than I have.
@Jstein77 if I'm understanding your comprehensive feature overview correctly, I think what you've outlined would work great for us. In our example we have multiple data subsidiaries who may have different retail calendars:
- 4-4-5 quarters with a year that starts on the first Sunday in July
- 4-5-4 quarters with a year that starts on the first week that has 4 or more days in April
- etc
We have built a bunch of SQL functions that can generate the required date spines, and at the moment have these denormalised into fields like date_week_july_sun_4_4_5
so we can just use them as dimensions depending on the subsidiary we are processing data for.
You can only choose one calendar for the entire query, not per metric
This limitation is totally fine. We would not expect to have a query with multiple calendars. We'd more likely have the calendar per subsidiary in our configuration, and all queries for that subsidiary would apply the calendar.
Your notes in "Risks" are the real bonuses for us:
you query a metric using a custom calendar and a regular calendar and get different values
Precisely what we need. "This week vs same week last year" would be different between two subsidiaries with different week start days.
We could maybe do some smart filtering on the calendar table to make to only join dates that are after the minimum time in the data set.
We do a form of this manually at the moment using a CTE to select the date range for a particular set of fiscal periods, then use the results of that (start and end dates) to filter the subsequent queries.