cube
cube copied to clipboard
{cube.sql()} is not supported in YAML/Python
Is your feature request related to a problem? Please describe. Extending cubes brings great code reusability. However, it is currently not supported for the Python cubes. Look at these 2 cubes:
# cubes/applications.yml
{% set model = dbt_model('applications') %}
cubes:
- name: "cube_{{ model.name | safe }}"
sql: >
SELECT * FROM {{ model.sql_table | safe }}
public: false
dimensions:
{{ model.as_dimensions() }}
# Model-specific measures
measures:
- name: total_count
type: count
# cubes/applications_sent.yml
cubes:
- name: cube_applications_sent
extends: cube_applications
sql: >
SELECT * FROM { cube_applications.sql() } WHERE applied_at is not null
public: false
According to the documentation, this could work, but it gives me the following error:
Error: Compile errors:
Can't parse python expression. Most likely this type of syntax isn't supported yet: Unsupported Python Trailer children node: TrailerContext: ()
at ErrorReporter.throwIfAny (/cube/node_modules/@cubejs-backend/schema-compiler/src/compiler/ErrorReporter.ts:133:13)
at DataSchemaCompiler.throwIfAnyErrors (/cube/node_modules/@cubejs-backend/schema-compiler/src/compiler/DataSchemaCompiler.js:213:23)
at /cube/node_modules/@cubejs-backend/schema-compiler/src/compiler/DataSchemaCompiler.js:108:16
at CompilerApi.getCompilers (/cube/node_modules/@cubejs-backend/server-core/src/core/CompilerApi.js:66:26)
at CompilerApi.scheduledPreAggregations (/cube/node_modules/@cubejs-backend/server-core/src/core/CompilerApi.js:189:31)
at RefreshScheduler.roundRobinRefreshPreAggregationsQueryIterator (/cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:474:38)
at /cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:595:12
at async Promise.all (index 0)
If I modify the applications_sent cube's SQL to SELECT * FROM { cube_applications.sql } WHERE applied_at is not null then I can see that the cube_applications.sql is injected as a JS arrow function () = > query.cubeSql(cube.cubeName()) – obviously Python cannot evaluate it
Describe the solution you'd like I would like to be able to extend Python cubes the same way as it is possible for the YAML / JS ones
Describe alternatives you've considered
Since I am preparing my data in dbt I can refactor the applications_sent cube to
# cubes/applications_sent.yml
{% set model = dbt_model('applications') %}
cubes:
- name: cube_applications_sent
extends: cube_applications
sql: >
SELECT * FROM {{ model.sql_table | safe }} WHERE applied_at is not null
public: false
^ this will work but feels a bit hacky
Hey @wasd171 👋 Thanks for reporting this! My understanding is that extends works as it should, the real issue is that {cube.sql()} doesn't work in YAML data models. You can try removing that—everything should just work then. However, I agree that this is something that should be implemented
Thanks for a swift answer @igorlukanin
You can try removing that—everything should just work then
But I need some table to select from and to apply my where filter. How could I do it without { cube.sql() }?
As a workaround, I can recommend to put that table name in a variable, then use it in both cubes. In the meantime, we should definitely make {cube.sql()} work.
Any news on {cube.sql()} issue?
@ZiggiZagga Supporting this is on the roadmap, however, there's currently no estimate as to when this might be done.
Hallo @igorlukanin,
I am learning cube and how it works. I have now come across the concept of Data Blending:
I am looking at the example: https://cube.dev/docs/product/data-modeling/concepts/data-blending
cube(`all_sales`, {
sql: `
SELECT
amount,
user_id AS customer_id,
created_at,
'online' AS row_type
FROM (${online_orders.sql()}) AS online --source cube via literal template
UNION ALL
SELECT
amount,
customer_id,
created_at,
'retail' AS row_type
FROM (${retail_orders.sql()}) AS retail --source cube via literal template
`,
});
I was wondering isn't it more intuitiv if we had an addition property called "cube_sql" which works as follows
cube(`all_sales`, {
-- new property "cube_sql"
cube_sql: `
SELECT
amount,
user_id AS customer_id,
created_at,
'online' AS row_type
FROM online_orders AS online --source cube via cube id
UNION ALL
SELECT
amount,
customer_id,
created_at,
'retail' AS row_type
FROM retail_orders AS retail --source cube via cube id
`,
});
this way, as a developer i dont need to call cube.sql() because it will be treated behind the scenes. All I need to do is refer to cubes by their identifiers.
What do you think is this possible?
Hi @ZiggiZagga, thanks for the code example!
Frankly, I don't see it as an improvement over the current situation.
First, it doesn't simplify the definition of data model: instead of {cube.sql()} one would need to write cube_sql. No big win syntax-wise.
Second, it introduces additional ambiguity: without checking which parameter is used, one can never be sure if some_table_or_cube in the SQL is a reference to a SQL table or to a cube.
I also doubt that bare some_table_or_cube references would work reliably at all times: now Cube lets you define cubes with names like from or where; would you like these to be treated as cube references in cube_sql as well?
Commenting to follow for updates, would love to see this feature work for .yml models too. Current workaround is a series of lengthy CTEs