cube icon indicating copy to clipboard operation
cube copied to clipboard

DBT integration fails when no model is present in the metric

Open lyogev opened this issue 1 year ago • 6 comments

Describe the bug Tried to use the DBT integration with the following code:

import dbt from '@cubejs-backend/dbt-schema-extension'

asyncModule(async () => {
  await dbt.loadMetricCubesFromDbtProject('/dbt');
});

And got the following failure

dbt-test-cube-1  | {
dbt-test-cube-1  |   "securityContext": {}
dbt-test-cube-1  | }
dbt-test-cube-1  | TypeError: Cannot read property 'match' of null
dbt-test-cube-1  |     at /cube/node_modules/@cubejs-backend/dbt-schema-extension/src/Dbt.ts:150:37
dbt-test-cube-1  |     at Array.forEach (<anonymous>)
dbt-test-cube-1  |     at Dbt.loadMetricCubesFromDbtProject (/cube/node_modules/@cubejs-backend/dbt-schema-extension/src/Dbt.ts:147:35)
dbt-test-cube-1  |     at dbt.js:4:3
dbt-test-cube-1  |     at DataSchemaCompiler.compileCubeFiles (/cube/node_modules/@cubejs-backend/schema-compiler/src/compiler/DataSchemaCompiler.js:143:5)
dbt-test-cube-1  |     at CompilerApi.getCompilers (/cube/node_modules/@cubejs-backend/server-core/src/core/CompilerApi.js:56:24)
dbt-test-cube-1  |     at CompilerApi.scheduledPreAggregations (/cube/node_modules/@cubejs-backend/server-core/src/core/CompilerApi.js:151:31)
dbt-test-cube-1  |     at RefreshScheduler.roundRobinRefreshPreAggregationsQueryIterator (/cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:382:38)
dbt-test-cube-1  |     at /cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:503:12
dbt-test-cube-1  |     at async Promise.all (index 0)
dbt-test-cube-1  |     at async Promise.all (index 1)
dbt-test-cube-1  |     at RefreshScheduler.runScheduledRefresh (/cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:199:9)
dbt-test-cube-1  |     at async Promise.all (index 0)
dbt-test-cube-1  |     at Timeout._onTimeout (/cube/node_modules/@cubejs-backend/shared/src/promises.ts:139:9)

This is due to the following code https://github.com/cube-js/cube.js/blob/794708e7ea3d540afdd86c58b32bab1c6a0d89c4/packages/cubejs-dbt-schema-extension/src/Dbt.ts#L150

It requires a model present in the metric to parse it.

However, model is not a required field, check out jaffle_shop_metrics example: https://github.com/dbt-labs/jaffle_shop_metrics/blob/main/models/marts/profit.yml

lyogev avatar Aug 08 '22 12:08 lyogev

Hey @lyogev ! Indeed this type of metric isn't supported yet. Could you please share your dbt project and elaborate on your use case?

paveltiunov avatar Aug 15 '22 06:08 paveltiunov

Hi @paveltiunov just check out DBT's own example project: https://github.com/dbt-labs/jaffle_shop_metrics/

Basically they've added support for something they call expression metrics Check it out here https://docs.getdbt.com/docs/building-a-dbt-project/metrics#expression-metrics

lyogev avatar Aug 19 '22 05:08 lyogev

Hey @paveltiunov

This is also an issue for me. I have a use case where one metric active users is coming from a dim_users and another sessions is coming from a fact_events. I want to calculate a sessions per user ratio of the 2 metrics. I defined in dbt this metric type as expression.

Thanks

chaos87 avatar Aug 20 '22 01:08 chaos87

@lyogev @chaos87 Would these metrics be defined for the same model or for different ones?

paveltiunov avatar Aug 23 '22 02:08 paveltiunov

@paveltiunov In my case it comes from 2 distinct models

If I believe dbt docs it's using dimensions and the time grain provided in order to join on the 2+ models https://docs.getdbt.com/docs/building-a-dbt-project/metrics#expression-metrics

I haven't tried the dbt_metrics package to see how the query gets built

chaos87 avatar Aug 23 '22 05:08 chaos87

@chaos87 It means it's a data blending use case that most likely should be done through UNION ALL instead of JOIN. I'm curious to see how dbt_metrics handles that.

paveltiunov avatar Aug 28 '22 17:08 paveltiunov

@paveltiunov I get the same error as @lyogev when I define a dbt expression metric for a ratio metric based off two metrics coming from the same model

My dbt metrics.yml. The error occurs when I add the metric pageviews_per_user

  - name: users
    label: Active Users
    model: ref('fact_events')
    description: "Number of active users"
    type: count_distinct
    sql: user_key
  
  - name: pageviews
    label: Page Views
    model: ref('fact_events')
    description: "Number of page view events"
    type: sum
    sql: IF(is_page_view, 1, 0)

  - name: pageviews_per_user
    label: Pageviews Per User
    description: "Average number of pageviews per user"
    type: expression
    sql: "{{metric('pageviews')}} / {{metric('users')}}"

I can try to block some time next month to work on a PR if this is welcomed on your side (we need dbt for catalog of metrics in my company, Cube is for querying them on the fly)

chaos87 avatar Sep 26 '22 02:09 chaos87

Hey folks. We're also looking to use derived metrics as part of our flow. Has there been any update on this?

SkinnyPigeon avatar Feb 24 '23 11:02 SkinnyPigeon

Due to the change in direction at dbt, we are deprecating @cubejs-backend/dbt-schema-extension. When a new dbt metrics package becomes available, we will update and reintroduce our integration.

keydunov avatar May 09 '23 23:05 keydunov

Hi @keydunov thanks for letting us know. We'd certainly be interested if you do look back into it after the release of dbt v1.6 when they roll their dbt_metrics package into the core platform complete with their integration with MetricFlow

SkinnyPigeon avatar May 10 '23 07:05 SkinnyPigeon