metricflow
metricflow copied to clipboard
[SL-1732] [Bug] Where filter is applied to inner query when using `join_to_timespine`
Is this a new bug in metricflow?
- [X] I believe this is a new bug in metricflow
- [X] I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
If I use a where filter when querying metrics that are joined to a time spine the filter is applied to the inner query before the timespine join. This means data is filtered out, then joined to the time spine so I see dates in the final result set that I expect to be filtered out.
For example, I issued the following query in the jaffle-sl-template repo.
dbt sl query --metrics twice_orders_fill_nulls_with_0,mqls --group-by metric_time --where "{{ TimeDimension('metric_time')
> '2020-01-01'" --order-by metric_time desc --compile}}
The data return includes records from 2014, which is when my time spine model starts
In the compiled SQL you can see that the filter is gettings applied to the metric sub query.
SELECT
COALESCE(subq_5131.metric_time**day, subq_5141.metric_time**day) AS metric_time__day
, COALESCE(MAX(subq_5131.mqls), 0) AS mqls
, MAX(subq_5141.twice_orders_fill_nulls_with_0) AS twice_orders_fill_nulls_with_0
FROM (
SELECT
metric_time__day
, COALESCE(mqls, 0) AS mqls
FROM (
SELECT
subq_5129.date_day AS metric_time__day
, subq_5127.mqls AS mqls
FROM dbt_jstein.metricflow_time_spine subq_5129
LEFT OUTER JOIN (
SELECT
metric_time__day
, SUM(mqls) AS mqls
FROM (
SELECT
DATE_TRUNC('day', first_contact_date) AS metric_time__day
, CASE WHEN mql_id IS NOT NULL THEN 1 ELSE 0 END AS mqls
FROM ANALYTICS.dbt_jstein.olist_marketing_qualified_leads olist_mqls_src_678
) subq_5125
WHERE metric_time__day > '2020-01-01'
GROUP BY
metric_time__day
) subq_5127
ON
subq_5129.date_day = subq_5127.metric_time__day
) subq_5130
) subq_5131
FULL OUTER JOIN (
SELECT
metric_time__day
, 2 * orders_fill_nulls_with_zero AS twice_orders_fill_nulls_with_0
FROM (
SELECT
metric_time__day
, COALESCE(order_count, 0) AS orders_fill_nulls_with_zero
FROM (
SELECT
subq_5138.date_day AS metric_time__day
, subq_5136.order_count AS order_count
FROM dbt_jstein.metricflow_time_spine subq_5138
LEFT OUTER JOIN (
SELECT
metric_time__day
, SUM(order_count) AS order_count
FROM (
SELECT
DATE_TRUNC('day', ordered_at) AS metric_time__day
, 1 AS order_count
FROM ANALYTICS.dbt_jstein.orders orders_src_680
) subq_5134
WHERE metric_time__day > '2020-01-01'
GROUP BY
metric_time__day
) subq_5136
ON
subq_5138.date_day = subq_5136.metric_time__day
) subq_5139
) subq_5140
) subq_5141
ON
subq_5131.metric_time**day = subq_5141.metric_time**day
GROUP BY
COALESCE(subq_5131.metric_time**day, subq_5141.metric_time**day)
ORDER BY metric_time__day
LIMIT 100
I would expect the filter to also be applied to the final sub query to only show the dates I requested. i.e
ON
subq_5131.metric_time**day = subq_5141.metric_time**day
WHERE
COALESCE(subq_5131.metric_time**day, subq_5141.metric_time**day) > '2020-01-01'
GROUP BY
COALESCE(subq_5131.metric_time**day, subq_5141.metric_time**day)
ORDER BY metric_time__day
Expected Behavior
I would expect the result set to start on 2020-01-01
Steps To Reproduce
- Checkout main on the jaffle-sl-template repo
- Run the following query
dbt sl query --metrics twice_orders_fill_nulls_with_0,mqls --group-by metric_time --where "{{ TimeDimension('metric_time')
> '2020-01-01'" --order-by metric_time desc --compile}}
Relevant log output
No response
Environment
- OS:
- Python:
- dbt:
- metricflow:
Which database are you using?
No response
Additional Context
No response
From SyncLinear.com | SL-1732