cube icon indicating copy to clipboard operation
cube copied to clipboard

Metabase sql query

Open itestyoy opened this issue 1 month ago • 6 comments

Hi! It seems that the cube doesn't utilize any pre-aggregations with 2 or more time dimensions

 dimensions: [
                        ......
                        CUBE.user_first_touch_date,
                        .......
                     ],
 time_dimension: CUBE.date,
EXPLAIN
SELECT
	CAST("public"."analytics_analysis_view"."date" AS date) AS "date",
	
	DATE_TRUNC('year', "public"."analytics_analysis_view"."user_first_touch_date") AS "user_first_touch_date",
	
	count(DISTINCT "public"."analytics_analysis_view"."distinct_user_count") AS "User Count"
FROM
	"public"."analytics_analysis_view"
WHERE ("public"."analytics_analysis_view"."app_name" = 'xxxx')

	AND("public"."analytics_analysis_view"."date" >= CAST((NOW() + INTERVAL '-160 day') AS date))
	AND("public"."analytics_analysis_view"."date" < CAST(NOW() AS date))
	AND("public"."analytics_analysis_view"."user_dimensions_country_code" = 'US')
	
GROUP BY
	CAST("public"."analytics_analysis_view"."date" AS date),
	DATE_TRUNC('year', "public"."analytics_analysis_view"."user_first_touch_date")
	
ORDER BY
	CAST("public"."analytics_analysis_view"."date" AS date) ASC,
	DATE_TRUNC('year', "public"."analytics_analysis_view"."user_first_touch_date") ASC

CubeScan: request={
 "measures": [
   "analytics_analysis_view.distinct_user_count"
 ],
 "dimensions": [],
 "segments": [],
 "timeDimensions": [
   {
     "dimension": "analytics_analysis_view.date",
     "granularity": "day",
     "dateRange": [
       "2023-12-05T00:00:00.000Z",
       "2024-05-12T23:59:59.999Z"
     ]
   },
   {
     "dimension": "analytics_analysis_view.user_first_touch_date",
     "granularity": "year"
   }
 ],
 "filters": [
   {
     "member": "analytics_analysis_view.app_name",
     "operator": "equals",
     "values": [
       "xxxx"
     ]
   },
   {
     "member": "analytics_analysis_view.user_dimensions_country_code",
     "operator": "equals",
     "values": [
       "US"
     ]
   }
 ]
}

itestyoy avatar May 13 '24 10:05 itestyoy

Hi @itestyoy 👋 Thanks for posting this!

Does anything change if you remove DATE_TRUNC around user_first_touch_date and query it as is?

Let me tag @paveltiunov so see if he has any advice here.

igorlukanin avatar May 13 '24 10:05 igorlukanin

@igorlukanin Hi!

In this case, the request looks like this, and cube utilizes pre-aggregations. But that case is important for performance reasons, especially when using date_trunc in Metabase.

CubeScan: request={
  "measures": [
    "analytics_analysis_view.distinct_user_count"
  ],
  "dimensions": [
    "analytics_analysis_view.user_first_touch_date"
  ],
  "segments": [],
  "timeDimensions": [
    {
      "dimension": "analytics_analysis_view.date",
      "granularity": "day",
      "dateRange": [
        "2024-05-03T00:00:00.000Z",
        "2024-05-12T23:59:59.999Z"
      ]
    }
  ],
  "filters": [
    {
      "member": "analytics_analysis_view.app_name",
      "operator": "equals",
      "values": [
        "xxxx"
      ]
    },
    {
      "member": "analytics_analysis_view.user_dimensions_country_code",
      "operator": "equals",
      "values": [
        "US"
      ]
    }
  ]
}

itestyoy avatar May 13 '24 11:05 itestyoy

All right, so I get that the query above works and matches a pre-aggregation.

What is it that you'd like to change or work differently?

igorlukanin avatar May 14 '24 09:05 igorlukanin

@igorlukanin Use pre aggregation with

DATE_TRUNC('year', "public"."analytics_analysis_view"."user_first_touch_date")

itestyoy avatar May 14 '24 09:05 itestyoy

I think you've got two options to achieve that:

  1. Either create a new pre-aggregation where user_first_touch_date will be a time dimension. (As you know, a pre-aggregation can only be configured to use a single time dimension.)
  2. Or, alternatively, define a new dimension like DATE_TRUNC('year', "public"."analytics_analysis_view"."user_first_touch_date") in your data model, add it to your existing pre-aggregation, and use it in your queries on the Metabase side.

What do you think?

igorlukanin avatar May 14 '24 10:05 igorlukanin

@igorlukanin

Thanks for the suggestions.

Either create a new pre-aggregation where user_first_touch_date will be a time dimension. (As you know, a pre-aggregation can only be configured to use a single time dimension.)

Yes, we did it one pre-aggregation with date and one with user_first_touch_date - and in general its same tables but with 2x storage.

If I understand correctly, the second option is about a new dimension user_first_touch_date_year ?

We use this option in Metabase for BI users. And for most users, it's so convenient than creating a new MB query with new dimension.

image

itestyoy avatar May 14 '24 11:05 itestyoy