cube icon indicating copy to clipboard operation
cube copied to clipboard

Cannot pre-aggregate rolling measure with `count_distinct`

Open naerymdan opened this issue 1 year ago • 9 comments

Describe the bug Trying to pre-aggregate a rolling_window measure give the error Error: Time series queries without dateRange aren't supported at BaseTimeDimension.timeSeries...

To Reproduce Create a measure with a rolling_window with a trailing start, create a pre-aggregation for it.

Expected behavior I expect the pre-aggregation to function correctly.

Minimally reproducible Cube Schema

cube(`session`, {
  sql_table: `test_db`,
  data_source: `default`,
  measures: {
    weeklyCount: {
      sql: `id`,
      type: `count`,
      rollingWindow: {
          leading: `1 week`,
          offset: `start`,
        },
    },
  },
  dimensions: {
    created_at: {
      sql: `${CUBE}."created_at"`,
      type: `time`
    }
  },
  preAggregations: {
    main: {
      measures: [session.weeklyCount],
      timeDimension: session.created_at,
      granularity: `day`
    }
  }
});

Version: 0.34.50

Additional context Using PostgreSQL, but might be the same issue as #6813 who is using MS SQL

naerymdan avatar Feb 06 '24 16:02 naerymdan

I've tried to build the pre-aggregation from the following data model on v0.35.18 with Postgres, and it worked for me:

cube(`session`, {
  sql: `
    SELECT 1 AS id, '2024-01-01T00:00:00.000Z' AS created_at UNION ALL
    SELECT 2 AS id, '2024-01-02T00:00:00.000Z' AS created_at UNION ALL
    SELECT 3 AS id, '2024-01-03T00:00:00.000Z' AS created_at
  `,
  data_source: `default`,
  measures: {
    weeklyCount: {
      sql: `id`,
      type: `count`,
      rollingWindow: {
          leading: `1 week`,
          offset: `start`,
        },
    },
  },
  dimensions: {
    created_at: {
      sql: `${CUBE}."created_at"`,
      type: `time`
    }
  },
  preAggregations: {
    main: {
      measures: [session.weeklyCount],
      timeDimension: session.created_at,
      granularity: `day`
    }
  }
});
Screenshot 2024-04-17 at 19 19 59 Screenshot 2024-04-17 at 19 20 03

Could you please upgrade to the latest version and double-check again?

igorlukanin avatar Apr 17 '24 17:04 igorlukanin

Closing this issue due to inactivity and lack of a reproduction. Please feel free to reopen.

igorlukanin avatar May 14 '24 10:05 igorlukanin

I experience this issue too. Using count_distinct_approx and count works fine, but using count_distinct results in the mentioned error.

nbphuoc avatar May 25 '24 06:05 nbphuoc

Same thing for me as well!

alrolorojas avatar May 25 '24 22:05 alrolorojas

using count_distinct results in the mentioned error

OK, I was able to reproduce it with count_distinct. Also, adding partition_granularity, build_range_start, and build_range_end does not help.

I'd love to hear what @paveltiunov thinks.

igorlukanin avatar May 31 '24 11:05 igorlukanin

As a workaround, I can maybe recommend using count_distinct_approx with Postgres HLL extension: https://cube.dev/docs/product/configuration/data-sources/postgres#count_distinct_approx

igorlukanin avatar May 31 '24 12:05 igorlukanin

As a workaround, I can maybe recommend using count_distinct_approx with Postgres HLL extension: https://cube.dev/docs/product/configuration/data-sources/postgres#count_distinct_approx

I would love to use count_distinct_approx, but unfortunately, Cube.js doesn't support count_distinct_approx for Redshift pre-aggregation, even though Redshift supports HLL itself. Will this feature be added soon?

nbphuoc avatar Jun 03 '24 17:06 nbphuoc