cube icon indicating copy to clipboard operation
cube copied to clipboard

Feedback for “Subquery dimensions”

Open mfclarke-cnx opened this issue 1 year ago • 7 comments

I was looking at this and it seemed to be what I needed. However, the page doesn't discuss usage of Subquery dimensions when you are querying with a time dimension. When I use it with a time dimension, the subquery doesn't account for time. It is just a general query that calculates over all data and doesn't group by the time granularity or filter within the date range specified in the query.

For the example given (average sales per sales manager) it seems natural that you can extend this by adding a time dimension so you can graph average sales per manager over time. But that's not possible with Subquery dimensions. What you get as a result is each day showing the average of sales managers who had a deal on said day, but the average amount is the average of each manager's total sales over all time, rather than during the day in question.

I think either I've found a bug, or a better example should be given where it doesn't matter that the subquery doesn't account for time. Otherwise others will be misled by the intention of Subquery dimensions and spend a lot of time scratching their head, like me!

mfclarke-cnx avatar Feb 11 '24 21:02 mfclarke-cnx

Here's the resulting SQL when I add a time dimension with day granularity btw. I'm using MS SQL.

SELECT
  TOP 10000q_0."deals__date_of_deal_day",
  "sales_managers__average_deal_amount" "sales_managers__average_deal_amount"
FROM
  (
    SELECT
      "keys"."deals__date_of_deal_day",
      avg("sales_managers__deals_amount") "sales_managers__average_deal_amount"
    FROM
      (
        SELECT
          DISTINCT dateadd(
            day,
            DATEDIFF(
              day,
              0,
              TODATETIMEOFFSET("sales_managers_key__deals".deal_date, '+00:00')
            ),
            0
          ) "deals__date_of_deal_day",
          "sales_managers_key__sales_managers".id "sales_managers__id"
        FROM
          sales_managers AS "sales_managers_key__sales_managers"
          LEFT JOIN deals AS "sales_managers_key__deals" ON "sales_managers_key__sales_managers".id = "sales_managers_key__deals".sales_manager_id
      ) AS "keys"
      LEFT JOIN sales_managers AS "sales_managers_key__sales_managers" ON "keys"."sales_managers__id" = "sales_managers_key__sales_managers".id
      LEFT JOIN (
        SELECT
          "sales_managers_deals_amount_subquery__sales_managers".id "sales_managers__id",
          sum(
            "sales_managers_deals_amount_subquery__deals".amount
          ) "sales_managers__deals_amount"
        FROM
          sales_managers AS "sales_managers_deals_amount_subquery__sales_managers"
          LEFT JOIN deals AS "sales_managers_deals_amount_subquery__deals" ON "sales_managers_deals_amount_subquery__sales_managers".id = "sales_managers_deals_amount_subquery__deals".sales_manager_id
        GROUP BY
          "sales_managers_deals_amount_subquery__sales_managers".id
      ) AS "sales_managers_deals_amount_subquery" ON "sales_managers_deals_amount_subquery"."sales_managers__id" = "sales_managers_key__sales_managers".id
    GROUP BY
      "keys"."deals__date_of_deal_day"
  ) as q_0
ORDER BY
  1 ASC

mfclarke-cnx avatar Feb 11 '24 21:02 mfclarke-cnx

Hi @mfclarke-cnx 👋

Let's zoom out a little bit here. What is your use case and what are you trying to calculate? Showing an example of data and desired result would help.

igorlukanin avatar Feb 20 '24 12:02 igorlukanin

Hi @igorlukanin, let me give a different example than the one given in the initial issue. I'll still use the same context as you have used in the docs for this feature (sales managers and deals).

Lets say for example I am trying to shift our sales strategy to focus on a lower number of high value deals, instead of a high number of low values deals. I would like to measure how each sales manager is doing with this objective over time. I need to see the maximum deal amount for each sales manager each day over time. It seems like subquery dimensions would be a good fit here after a cursory read of the documentation. However the detail that subquery dimensions don't work with a time dimension is missing from the documentation.

My feedback is just on the documentation itself (I clicked the "Question? Give us feedback" link on the page). With the caveat that I might be misunderstanding something, or maybe this is a bug and it is supposed to support time dimensions properly?

mfclarke-cnx avatar Mar 11 '24 01:03 mfclarke-cnx

Thanks for elaborating on this @mfclarke-cnx!

However the detail that subquery dimensions don't work with a time dimension is missing from the documentation.

How would you use it with a time dimension, exactly? I'm really trying to understand that.

igorlukanin avatar Mar 20 '24 15:03 igorlukanin

How would you not is a better question I think :) Without a time dimension used by the sub query, you can't usefully chart data that uses subquery dimensions over time. The subquery dimension values would always be for all data for all time, rather than for the specific time granule (day, hour etc) being specified by the query.

This is what leads me to think this might be a bug with the MS SQL implementation and perhaps this works correctly with other data sources.

mfclarke-cnx avatar Mar 20 '24 21:03 mfclarke-cnx