cube
cube copied to clipboard
Bug with subquery and JOIN operation within filters
A cube crashes when we use field_2 as a filter but works well when we use field_2 as dimensions.
Steps to reproduce the behavior:
- Run a query with filters:
{
"filters": [
{
"member": "model_b.field_2",
"operator": "set"
}
],
"dimensions": [
"model_b.field_2",
"model_b.cube_unique_id"
],
"order": {
"model_b.field_2": "asc"
}
}
and then receive an error:
RangeMaximum call stack size exceeded
- Run without filters:
{
"dimensions": [
"model_b.field_2",
"model_b.cube_unique_id"
],
"order": {
"model_b.field_2": "asc"
}
}
Minimally reproducible Cube Schema:
asyncModule(async () => {
cube('model_a', {
sql: `select * from table`,
dimensions: {
field_1: {
sql: `field_1`,
type: 'string'
},
},
});
cube('model_b', {
sql: `select * from table`,
dimensions: {
cube_unique_id: {
sql: `row_id`,
type: 'string',
primary_key: true
},
field_1: {
sql: `sum(${metric_1}) over ()`,
type: `number`,
sub_query: true,
propagate_filters_to_sub_query: true,
},
field_2: {
sql: `${model_a.field_1}`,
type: 'string'
},
},
measures: {
metric_1: {
sql: `metric_1`,
type: 'max'
}
},
joins: {
model_a: {
relationship: 'one_to_one',
sql: `${CUBE.field_1} = ${model_a.field_1}`
},
},
});
})
Version: 0.35.1
Hi @itestyoy 👋
Well, this is a pretty interesting data model!
- If you comment out or remove
sub_query: trueandpropagate_filters_to_sub_query: true, then theRangeMaximum call stack size exceedederror will be gone. I feel like this should not be a subquery dimension at all because you're referencing the measure from the same cube (model_b.metric_1) rather than from another cube there. model_b.field_1should probably not be a dimension at all, I can see that it's defined assum(${metric_1}) over ()and that looks like an aggregation to me. Aggregations should go to measures, not dimensions.
I hope this will help you fix this data model.
Hi @igorlukanin !
Thank you for the explanation, but
It is not a real model, it's just a simple example demonstrating how to encounter the error without involving all models. If we use metrics from another cube in a subquery, the error will be the same. In our case, we're not using SUM, we're using FIRST_VALUE with a PARTITION BY clause.
My question is not about the model, it's about why it works well with dimensions but throws an error with filters.
Even if this data model is simplified for this example, my point is that it doesn't look sound to me. This is not how subquery dimensions are supposed to be defined:
field_1: {
sql: `sum(${metric_1}) over ()`,
type: `number`,
sub_query: true,
propagate_filters_to_sub_query: true,
},
If you rework that part, the error will be gone.
Let me also tag @paveltiunov in case he'd have an alternative take on whether this is a sound data model or not.
@igorlukanin @paveltiunov Hi!
If you rework that part, the error will be gone.
We reworked this part with an additional cube and we achieve propagate_filters_to_sub_query using ${FILTER_GROUP(FILTER_PARAMS....)}, but it's not a good approach because we need to synchronize FILTER_PARAMS with dimensions.
For me, regardless of whether it looks like a model or not, it seems like a bug. Maybe there are any updates?
@itestyoy The data model that you've initially provided is not correct and it's not supposed to work. You can't have an aggregation in the sql property of a subquery dimension. That is why I don't think there's any bug in Cube here.
@igorlukanin But why it works good as dimension? What is the difference between
select a from b- workselect a from b where a > 1- not work
?
Sorry, I don't understand.
How are select a from b and select a from b where a > 1 related anything that we discussed above?
@igorlukanin
This doesn't work
(1)
{
"filters": [
{
"member": "model_b.field_2",
"operator": "set"
}
],
"dimensions": [
"model_b.field_2",
"model_b.cube_unique_id"
],
"order": {
"model_b.field_2": "asc"
}
}
This works (2)
{
"dimensions": [
"model_b.field_2",
"model_b.cube_unique_id"
],
"order": {
"model_b.field_2": "asc"
}
}
The only difference is filters with the same field as dimension. If using your explanation, query 2 also will not work, but it does.