cube icon indicating copy to clipboard operation
cube copied to clipboard

Preaggregations are not working from Metabase

Open Betilopeza opened this issue 1 year ago • 7 comments

Hello! we are testing preaggregations to improve the latency for the questions to appear in Metabase dashboards. We are currently using:

  • Cube Cloud Latest (0.35.22)
  • Metabase Cloud v1.49.6
  • Google Cloud (have the prod_pre_aggregations dataset already saving this aggregations I did)

For some reason, when we create a preaggregation with measures that worked perfectly fine in our playground, but when queried from Metabase, are not preaggregated as shown below in the example.

image image image

The dynamic schema for that table is this one:

{% set account = "accountUuid" %}

{% set interaction = "interactionUuid" %}
{% set time = "time" %}
cubes:
  {%- for cube in load_interaction_tags()["cubes"] %}

  - name: {{ cube.name }}
    sql: {{ cube.sql }}
    dataSource: default

  {%- if cube.dimensions is not none and cube.dimensions|length > 0 %}
    pre_aggregations:
      - name: {{cube.name + "_FullTable" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
          - InteractionLocations.createdAt
          - InteractionLocations.isDone
          - InteractionLocations.url
          - MGMLocations.Active_Status
          - MGMLocations.HOA_Type
          - MGMLocations.Plan_Type
          - MGMLocations.name
          - MGMLocations.Account_Manager
        refresh_key:
          every: 1 hour

      - name: {{cube.name + "_LocationInteraction3" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
          - InteractionLocations.createdAt
          - InteractionLocations.url
          - MGMLocations.Active_Status
          - MGMLocations.HOA_Type
          - MGMLocations.Plan_Type
          - MGMLocations.name
          - MGMLocations.Account_Manager
        refresh_key:
          every: 1 hour

      - name: {{cube.name + "_LocationInteraction4" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
          - InteractionLocations.createdAt
          - InteractionLocations.url
          - MGMLocations.name
          - MGMLocations.Account_Manager
        refresh_key:
          every: 1 hour

      - name: {{cube.name + "_table" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
        refresh_key:
          every: 1 hour

      {%- for dimension in cube.dimensions %}

      - name: {{ dimension.name + "_filter" }} 
        dimensions:
          - {{ dimension.name }}
        refresh_key:
          every: 1 hour

      - name: {{ cube.name + "_distinctUuid" }} 
        measures:
          - distinctUuid
        dimensions:
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
        refresh_key:
          every: 1 hour

      {%- endfor %}

      - name: {{ cube.name + "_distinctUuid" }} 
        dimensions:
          {%- for dimension in cube.dimensions %} 
          {%- if dimension.name != account and  dimension.name != interaction %}
          - {{ dimension.name }}
          {%- endif %}
          {%- endfor %}
        measures:
          - distinctUuid
        refresh_key:
          every: 1 hour

      - name: {{ cube.name + "_distinct" }} 
        measures:
          - distinctUuid
  {%- endif %}

  {%- if cube.joins is not none and cube.joins|length > 0 %}
    joins:
      {%- for join in cube.joins %}
      - name: {{ join.name }}
        relationship: {{ join.relationship }}
        sql: {{ join.sql }}
      {%- endfor %}
  {%- endif %}

  {%- if cube.measures is not none and cube.measures|length > 0 %}
    measures:
      {%- for measure in cube.measures %}
      - name: {{ measure.name }}
        type: {{ measure.type }}
      {%- if measure.sql %}
        sql: {{ measure.sql }}
      {%- endif %}
      {%- endfor %}
  {%- endif %}

  {%- if cube.dimensions is not none and cube.dimensions|length > 0 %}
    dimensions:
      {%- for dimension in cube.dimensions %}
      - name: {{ dimension.name }}
        sql: {{ dimension.sql }}
        type: {{ dimension.type }}
        {% if dimension.primaryKey == True -%}
        primaryKey: true
        public: true
        {% endif -%}
      {%- endfor %}
  {%- endif %}
  {%- endfor %}

And this is our cube.js

const {memoizedFetchAccountPassword} = require("./sql-auth")

function decodeBase64(data) {
    let buff = Buffer.from(data, 'base64');
    return buff.toString('ascii');
}

const deconstructGlobalId = (globalId) => {
    const decoded = decodeBase64(globalId)
    const globalIdParts = decoded.split(":")

    return {
        node: globalIdParts[0],
        uuid: globalIdParts[1]
    }
}

const contextToAppId = (context) => `CUBEJS_APP_${context.cacct}`

const extendContext = (req) => {
  // For SQL API (context is then handled by checkSqlAuth)
    if (req.headers === undefined || req.headers.cacct === undefined) {
        return;
    }

    try {
        const deconstructed = deconstructGlobalId(req.headers.cacct);
        return {cacct: deconstructed["uuid"]};
    } catch (err) {
        console.log("Error extending context: " + err)
    }
}

const queryRewrite = (query, request) => {
    console.log("Rewriting for", JSON.stringify(request))
    const accountId = request.cacct !== undefined ? request.cacct : request.securityContext.cacct;
    
    if (query["dimensions"].length > 0) {
        try {
            var table_name = query["dimensions"][0].substring(0, query["dimensions"][0].indexOf("."));
            } catch {
            var table_name = query["dimensions"][0]["cubeName"]
            }   
    }
    // measures in query
    else if (query["measures"].length > 0) {
        try {
            var table_name = query["measures"][0].substring(0, query["measures"][0].indexOf("."))
            } catch {
            var table_name = query["measures"][0]["cubeName"]
            }
    }

    else if (query["timeDimensions"].length > 0) {
        try {
        var table_name = query["timeDimensions"][0]["dimension"].substring(0, query["timeDimensions"][0]["dimension"].indexOf("."))
        } catch {
            var table_name = query["measures"][0]["cubeName"]
        }
    }

    else {
        console.log("Using account id:", accountId)
        console.log("The query:", query)
    }

    console.log("Using account id:", accountId);

    query.filters.push({
        member: `${table_name}.accountUuid`,
        operator: 'equals',
        values: [accountId],
    });

    return query;
}

const checkSqlAuth = async (req, username) => {
    // Remove this section for containing sensitive information
}

const queueOptions = {
  concurrency: 2,
  executionTimeout: 600,
  orphanedTimeout: 120,
  heartBeatInterval: 120,
};

module.exports = {
    http: {
        cors: {
            origin: '*',
            methods: 'GET,HEAD,PUT,PATCH,POST,DELETE',
            preflightContinue: false,
            allowedHeaders: ['Content-Type', 'Authorization', 'cacct'],
            optionsSuccessStatus: 204,
        },
    },
    scheduledRefreshTimer: 120,
    checkSqlAuth,
    contextToAppId,
    extendContext,
    queryRewrite,
    orchestratorOptions: {
        queryCacheOptions: {
            refreshKeyRenewalThreshold: 120,
            backgroundRenew: true,
            queueOptions,
        },
        preAggregationsOptions: {queueOptions},
    },
    // Figure out ScheduledRefreshContexts later! (probably needed for QueryRewrite)
    // Placeholder to prevent the error message:
    scheduledRefreshContexts: () => [
        {
            securityContext: {
                cacct: '00000000-0000-0000-0000-000000000000'
            }
        }
    ],
};

Please if you can check if we are doing something wrong, or if we are missing something, would be really helpful

Betilopeza avatar Apr 30 '24 19:04 Betilopeza

@Betilopeza It seems distinctUuid is a measure however you're using it in GROUP BY. I expect to see it inside COUNT(DISTINCT distinctUuid) aggregation reference. Could you please elaborate on your use case?

paveltiunov avatar May 01 '24 03:05 paveltiunov

Hello, yes we tried in the group by, because in the measure sections the preaggregation was not working either! I send you the screenshots. Preaggregation: - name: {{ cube.name + "_distinct" }} measures: - distinctUuid

image (21) image (22) image (23)

Betilopeza avatar May 02 '24 15:05 Betilopeza

@Betilopeza There're dimensions in this pre-aggregation and this pre-aggregation is non-additive. So it couldn't match a query without dimensions defined. Please see https://cube.dev/docs/product/caching/matching-pre-aggregations.

paveltiunov avatar May 02 '24 22:05 paveltiunov

We are trying to make work this preaggregation: Preaggregation:

  • name: {{ cube.name + "_distinct" }} measures:
  • distinctUuid

As you can see it actually doesn´t have any dimension, so it´s happenning what I send you in the previous message. In playground works perfectly, but not in metabase. I did't get why you said that we have dimensions

Betilopeza avatar May 03 '24 18:05 Betilopeza

@Betilopeza Unfortunately we can't reproduce it on our side. If you're in Cube Cloud please prepare a branch with a description on how to reproduce and file a ticket with the support team. We'll take a look.

paveltiunov avatar May 03 '24 23:05 paveltiunov

Thanks! Unfortunately, we haven't been able to reach support since February, so I don't think that would be an option.

Betilopeza avatar May 06 '24 19:05 Betilopeza

Hi @Betilopeza, you can send the link to your branch with a reproduction and any relevant information to me via [email protected].

igorlukanin avatar May 13 '24 11:05 igorlukanin