cube icon indicating copy to clipboard operation
cube copied to clipboard

Intermittently incorrect values when using calculated measures on pre-aggregated leaf measures

Open Kelvin4702 opened this issue 1 year ago β€’ 11 comments

Describe the bug Querying the pre-aggregate using the REST API will intermittently generate incorrect results. Upon inspecting the logs, it appears that:

  • The generated SQL statement is correct
  • However, the cubestore query planner forms incorrect aliases, causing measures to be swapped around

This behavior is not consistent -- similar REST API queries that should return the same result will occasionally be correct.

To Reproduce Using the following REST API query:

{
  "query": {
    "measures": [
      "sales.this_year_sales",
      "sales.this_year_units",
      "sales.yoy_sales_chg",
      "sales.yoy_units_chg"
    ],
    "dimensions": [
      "sales.product_type"
    ],
    "filters": [
      {
        "member": "sales.product_type",
        "operator": "equals",
        "values": ["PET_FOOD"]
      },
      {
        "member": "sales.date",
        "operator": "inDateRange",
        "values": ["2025-01-18","2025-01-24"]
      }
    ],
    "limit": 5,
    "order": {
      "sales.this_year_sales": "desc"
    },
    "offset": 0
  }
}
  1. Execute the query - observe that results may or may not be correct
  2. Modify the limit parameter value, which should have no effect on the query results
  3. Execute the modified query - observe that results may or may not be correct
  4. Repeat steps 2 and 3 as necessary until incorrect results are returned (personal observations yield a ~50% success rate)

In the screenshots section below, observe that the (implicitly added) last_year_sales and last_year_units values are swapped, causing the yoy_sales_chg and yoy_units_chg to return incorrect values.

For example, lets say that the values stored in the pre-aggregate are as follows:

date_day product_type this_year_sales last_year_sales this_year_units last_year_units
2025-01-21 PET_FOOD 1000 900 10 9

In this example, the expected values are:

  • yoy_sales_chg = 1000 - 900 = 100
  • yoy_units_chg = 10 - 9 = 1

Instead, the following is returned:

  • yoy_sales_chg = 1000 - 9 = 991
  • yoy_units_chg = 10 - 900 = -890

Expected behavior Query results should be consistent and correct.

Screenshots The generated SQL query in the logs looks correct (formatted for readability):

SELECT
    `sales__product_type` `sales__product_type`,
    sum(`sales__this_year_sales`) `sales__this_year_sales`,
    sum(`sales__this_year_units`) `sales__this_year_units`,
    (sum(`sales__this_year_sales`) - sum(`sales__last_year_sales`)) `sales__yoy_sales_chg`,
    (sum(`sales__this_year_units`) - sum(`sales__last_year_units`)) `sales__yoy_units_chg`
FROM cube_pre_aggregations.sales_sales_rollup_20250101_mwnaosch_js50fjs4_1jq1ihq AS `sales__sales_rollup`
WHERE (`sales__product_type` = ?)
  AND (`sales__date_day` >= to_timestamp(?) AND `sales__date_day` <= to_timestamp(?))
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5

However, the following log was observed (formatted for readability) which appears to internally swap the last_year_sales and last_year_units leaf measures. I suspect this is leading to incorrectly calculated yoy_sales_chg and yoy_units_chg values:

2025-01-31T19:28:29.313Z WARN  [cubestore::queryplanner::query_executor] <pid:41 sel1> Slow Partition Query (262.100554ms):
Projection, [
  sales__product_type,
  sales__this_year_sales,
  sales__this_year_units,
  sales__yoy_sales_chg,
  sales__yoy_units_chg
]
  Projection, [
    sales__sales_rollup.sales__product_type:sales__product_type,
    sales__this_year_sales,
    sales__this_year_units,
    SUM(sales__sales_rollup.sales__last_year_sales):SUM(sales__sales_rollup.sales__last_year_units),
    SUM(sales__sales_rollup.sales__last_year_units):SUM(sales__sales_rollup.sales__last_year_sales)
  ]
    ClusterAggregateTopK, limit: 5
      Filter
        Scan sales__sales_rollup, 
          source: CubeTable(index: default:3:[5, 6, 41]),
          fields: [
            sales__product_type, 
            sales__date_day,
            sales__last_year_sales,
            sales__last_year_units,
            sales__this_year_sales,
            sales__this_year_units
          ]

Minimally reproducible Cube Schema

cubes:
  - name: sales
    sql_table: "{{ env_var('CUBEJS_DB_BQ_TABLE_NAME') | safe }}"
    data_source: default
    refresh_key:
      every: 30 minutes

    pre_aggregations:
      - name: sales_rollup
        external: true
        dimensions:
          - date
          - product_type
        measures:
          - this_year_sales
          - last_year_sales
          - this_year_units
          - last_year_units
        time_dimension: date
        granularity: day
        partition_granularity: month
        refresh_key:
          every: 30 minutes
        build_range_start:
          sql: select timestamp('2025-01-17')
        build_range_end:
          sql: select timestamp('2025-01-25')

    dimensions:
      - name: date
        sql: TIMESTAMP(date)
        type: time

      - name: product_type
        sql: product_type
        type: string

      - name: product_name
        sql: product_name
        type: string

      - name: product_id
        sql: product_id
        type: string

    measures:
      - name: this_year_sales
        sql: this_year_sales
        type: sum

      - name: last_year_sales
        sql: last_year_sales
        type: sum

      - name: this_year_units
        sql: this_year_units
        type: sum

      - name: last_year_units
        sql: last_year_units
        type: sum

      - name: yoy_sales_chg
        sql: "{this_year_sales} - {last_year_sales}"
        type: number

      - name: yoy_units_chg
        sql: "{this_year_units} - {last_year_units}"
        type: number

Version: Behavior was observed using Docker images with tags between v1.1.8 and v1.3.20:

  • cubejs/cube:v1.3.20
  • cubejs/cubestore:v1.3.20

This is not reproducible using cubejs/cubestore:v1.1.7, and appears to have been introduced in v1.1.8.

Additional context

  • Running in production mode (1 api, 1 refresh worker, 1 router, 2 workers)
  • Raw data is stored Google BigQuery
  • Pre-aggregations are stored in GCS using export bucket strategy

_edit: retested using latest (1.2.20) on 2025-03-10 and issue is still present _edit: retested using latest (1.3.8) on 2025-04-25 and issue is still present _edit: retested using latest (1.3.20) on 2025-06-09 and issue is still present

Kelvin4702 avatar Feb 05 '25 16:02 Kelvin4702

Thanks for reporting @Kelvin4702. Let us investigate and get back to you.

igorlukanin avatar Feb 27 '25 13:02 igorlukanin

Any news on this ? We have the exact same issue and it's blocking

Rmannn avatar Mar 19 '25 19:03 Rmannn

@Kelvin4702 @Rmannn Could you please try v1.3.7 (latest)?

I've tried to reproduce this on the latest and everything works as expected for me.

Data model that I've used (with DuckDB):

cubes:
  - name: sales
    sql: >
      SELECT '2025-01-18' AS date, 'PET_FOOD' AS product_type, 'B1' AS product_name, 2 AS product_id, 200 AS this_year_sales, 100 AS last_year_sales, 2000 AS this_year_units, 1000 AS last_year_units UNION ALL
      SELECT '2025-01-19' AS date, 'PET_FOOD' AS product_type, 'C1' AS product_name, 3 AS product_id, 300 AS this_year_sales, 150 AS last_year_sales, 3000 AS this_year_units, 1500 AS last_year_units UNION ALL
      SELECT '2025-01-20' AS date, 'PET_FOOD' AS product_type, 'D1' AS product_name, 4 AS product_id, 400 AS this_year_sales, 200 AS last_year_sales, 4000 AS this_year_units, 2000 AS last_year_units
    data_source: default
    refresh_key:
      every: 30 minutes

    pre_aggregations:
      - name: sales_rollup
        external: true
        dimensions:
          - date
          - product_type
        measures:
          - this_year_sales
          - last_year_sales
          - this_year_units
          - last_year_units
        time_dimension: date
        granularity: day
        partition_granularity: month
        refresh_key:
          every: 30 minutes
        build_range_start:
          sql: select '2025-01-17'::TIMESTAMP
        build_range_end:
          sql: select '2025-01-25'::TIMESTAMP

    dimensions:
      - name: date
        sql: "date::TIMESTAMP"
        type: time

      - name: product_type
        sql: product_type
        type: string

      - name: product_name
        sql: product_name
        type: string

      - name: product_id
        sql: product_id
        type: string

    measures:
      - name: this_year_sales
        sql: this_year_sales
        type: sum

      - name: last_year_sales
        sql: last_year_sales
        type: sum

      - name: this_year_units
        sql: this_year_units
        type: sum

      - name: last_year_units
        sql: last_year_units
        type: sum

      - name: yoy_sales_chg
        sql: "{this_year_sales} - {last_year_sales}"
        type: number

      - name: yoy_units_chg
        sql: "{this_year_units} - {last_year_units}"
        type: number

One of many query runs with different LIMIT:

Image

igorlukanin avatar Apr 24 '25 13:04 igorlukanin

Just tried on v1.3.8 -- issue still seems to be occurring.

Here's an excerpt from the cubestore worker logs, where we can see a query with "limit": 14 answer correctly, then another query with "limit": 15 answer incorrectly:

2025-04-25T15:34:32.030Z INFO  [cubestored] <pid:96 sel3> Cube Store version 1.3.8
2025-04-25T15:34:32.361Z INFO  [cubestore::cluster] <pid:3> Running select completed (376.196357ms)
2025-04-25T15:34:32.366Z WARN  [cubestore::queryplanner::query_executor] <pid:3> Slow Query (377.587957ms):
Projection, [sales__product_type, sales__this_year_sales, sales__this_year_units, sales__yoy_sales_chg, sales__yoy_units_chg]
  Projection, [sales__sales_rollup.sales__product_type:sales__product_type, sales__this_year_sales, sales__this_year_units, SUM(sales__sales_rollup.sales__last_year_sales), SUM(sales__sales_rollup.sales__last_year_units)]
    ClusterAggregateTopK, limit: 14
      Filter
        Scan sales__sales_rollup, source: CubeTable(index: default:8:[8, 15]), fields: [sales__product_type, sales__date_day, sales__last_year_sales, sales__last_year_units, sales__this_year_sales, sales__this_year_units]

2025-04-25T15:34:42.392Z INFO  [cubestored] <pid:125 sel4> Cube Store version 1.3.8
2025-04-25T15:34:42.588Z INFO  [cubestore::cluster] <pid:3> Running select completed (233.05261ms)
2025-04-25T15:34:42.589Z WARN  [cubestore::queryplanner::query_executor] <pid:3> Slow Query (233.877108ms):
Projection, [sales__product_type, sales__this_year_sales, sales__this_year_units, sales__yoy_sales_chg, sales__yoy_units_chg]
  Projection, [sales__sales_rollup.sales__product_type:sales__product_type, sales__this_year_sales, sales__this_year_units, SUM(sales__sales_rollup.sales__last_year_sales):SUM(sales__sales_rollup.sales__last_year_units), SUM(sales__sales_rollup.sales__last_year_units):SUM(sales__sales_rollup.sales__last_year_sales)]
    ClusterAggregateTopK, limit: 15
      Filter
        Scan sales__sales_rollup, source: CubeTable(index: default:8:[8, 15]), fields: [sales__product_type, sales__date_day, sales__last_year_sales, sales__last_year_units, sales__this_year_sales, sales__this_year_units]

Kelvin4702 avatar Apr 25 '25 15:04 Kelvin4702

We are also experiencing this same issue on v1.3.8. We have downgraded to 1.1.7 in the meantime and can confirm that this error does not occur in this version.

rsyvarth avatar Apr 27 '25 19:04 rsyvarth

Same,

  • trying 1.3.8 to 1.3.11, issue is still present.

Rmannn avatar Apr 28 '25 12:04 Rmannn

Adding an orderBy to the cube request seems to fix the issue.

Rmannn avatar May 12 '25 14:05 Rmannn

Adding order by also helped me, as a workaround I used query rewrite to add order to the query when it doens't exist, so the user will not need to remember that

guydou avatar Jun 09 '25 11:06 guydou

Unfortunately, adding an order parameter to the REST API request did not remove the issue for the example in the original post :( Tested using the latest version (1.3.20).

Kelvin4702 avatar Jun 09 '25 19:06 Kelvin4702

@Kelvin4702 It worked for us on version 1.3.15 Alternatively you can try to add to your payload all the meassures that constract other measures.

In your case, add:

  • sales.last_year_sales
  • sales.last_year_units

ormahler avatar Jun 13 '25 07:06 ormahler

@igorlukanin To emphasize- this issue occurs when using measures that are composed out of other measures. Since it’s a silent bug, customers may not be aware it affects them. Consider adding warning tooltip next to this part of the documentation with the afected versions.

ormahler avatar Jun 13 '25 07:06 ormahler

Moving to non-partitioned pre-aggregation solved the issue for us, but not always possible

laurent-cobos avatar Jul 30 '25 15:07 laurent-cobos

Finally in our case I think it was because our API and the Refresh Worker were concurrently trying to build our cube partitioned pre-aggregations, resulting in errors and partially built pre-aggs, causing the shift in the columns.

laurent-cobos avatar Jul 30 '25 22:07 laurent-cobos