duckdb-r
Construction of deep relational trees
This is a toy example, but relevant for some CRAN packages with the default setting of max_expression_depth. The symptoms are the same as when evaluating rel7.
Ideally, we would already see an error when constructing rel5. However, the system lets me construct rel5 and even rel6, only construction of rel7 fails with the same error as the evaluation of rel5. Is this an off-by-two error, or something more serious?
duckplyr can fall back to dplyr if the error happens at construction, but not if it happens at evaluation, which is too late. An error on construction of rel5 or perhaps even rel4 would fix the downstream problem. How to achieve this?
duckdb <- asNamespace("duckdb")
con <- DBI::dbConnect(duckdb::duckdb())
experimental <- FALSE
df1 <- tibble::tibble(id = 1L)
DBI::dbExecute(con, "SET max_expression_depth TO 5")
#> [1] 0
rel1 <- duckdb$rel_from_df(con, df1, experimental = experimental)
rel2 <- duckdb$rel_project(
  rel1,
  list({
    tmp_expr <- duckdb$expr_reference("id")
    duckdb$expr_set_alias(tmp_expr, "id")
    tmp_expr
  })
)
rel3 <- duckdb$rel_project(
  rel2,
  list({
    tmp_expr <- duckdb$expr_reference("id")
    duckdb$expr_set_alias(tmp_expr, "id")
    tmp_expr
  })
)
rel4 <- duckdb$rel_project(
  rel3,
  list({
    tmp_expr <- duckdb$expr_reference("id")
    duckdb$expr_set_alias(tmp_expr, "id")
    tmp_expr
  })
)
rel4
#> DuckDB Relation:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [id as id]
#> Projection [id as id]
#> Projection [id as id]
#> r_dataframe_scan(0x11cca4278)
#>
#> ---------------------
#> -- Result Columns --
#> ---------------------
#> - id (INTEGER)
rel5 <- duckdb$rel_project(
  rel4,
  list({
    tmp_expr <- duckdb$expr_reference("id")
    duckdb$expr_set_alias(tmp_expr, "id")
    tmp_expr
  })
)
rel5
#> DuckDB Relation:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [id as id]
#> Projection [id as id]
#> Projection [id as id]
#> Projection [id as id]
#> r_dataframe_scan(0x11cca4278)
#>
#> ---------------------
#> -- Result Columns --
#> ---------------------
#> - id (INTEGER)
rel6 <- duckdb$rel_project(
  rel5,
  list({
    tmp_expr <- duckdb$expr_reference("id")
    duckdb$expr_set_alias(tmp_expr, "id")
    tmp_expr
  })
)
rel6
#> DuckDB Relation:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [id as id]
#> Projection [id as id]
#> Projection [id as id]
#> Projection [id as id]
#> Projection [id as id]
#> r_dataframe_scan(0x11cca4278)
#>
#> ---------------------
#> -- Result Columns --
#> ---------------------
#> - id (INTEGER)
rel7 <- duckdb$rel_project(
  rel6,
  list({
    tmp_expr <- duckdb$expr_reference("id")
    duckdb$expr_set_alias(tmp_expr, "id")
    tmp_expr
  })
)
#> Error: {"exception_type":"Binder","exception_message":"Max expression depth limit of 5 exceeded. Use \"SET max_expression_depth TO x\" to increase the maximum expression depth."}
rel7
#> Error in eval(expr, envir, enclos): object 'rel7' not found
duckdb$rel_to_altrep(rel6)
#> Error: Error evaluating duckdb query: Parser Error: Maximum tree depth of 5 exceeded in logical planner
duckdb$rel_to_altrep(rel5)
#> Error: Error evaluating duckdb query: Parser Error: Maximum tree depth of 5 exceeded in logical planner
duckdb$rel_to_altrep(rel4)
#> id
#> 1 1
Created on 2024-03-10 with reprex v2.1.0
See https://github.com/duckdblabs/duckplyr/commit/ffa7e96ac50db7a4d3d0d7f73ef0930337af97df for my workaround.
The necessary margin seems to be larger than 2, and even larger than 10. This helped with at least one reverse dependency; we'll see.
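For illustration, a construction-time guard with a safety margin could look roughly like the sketch below. It is not taken from the linked commit, and none of these names exist in duckdb or duckplyr: new_tracked(), project_checked() and fake_project() are hypothetical, and fake_project() only stands in for a closure around duckdb$rel_project(). The limit and margin values are illustrative; as noted above, the margin needed in practice seems to be much larger.

```r
# Hypothetical wrapper, not part of duckdb or duckplyr: carry the tree depth
# next to the relation and refuse to build past `limit - margin`.
new_tracked <- function(rel, depth = 1L) {
  list(rel = rel, depth = depth)
}

project_checked <- function(tracked, project, limit, margin) {
  new_depth <- tracked$depth + 1L
  if (new_depth > limit - margin) {
    stop(
      "Relation depth ", new_depth, " exceeds the safe depth of ",
      limit - margin, " (limit ", limit, ", margin ", margin, ")",
      call. = FALSE
    )
  }
  new_tracked(project(tracked$rel), new_depth)
}

# Stand-in for duckdb$rel_project(); a real caller would pass a closure
# around that call instead.
fake_project <- function(rel) paste0("Projection(", rel, ")")

rel1 <- new_tracked("scan")
rel2 <- project_checked(rel1, fake_project, limit = 5L, margin = 2L)
rel3 <- project_checked(rel2, fake_project, limit = 5L, margin = 2L)

# The fourth node would exceed 5 - 2 = 3, so construction itself fails and
# the caller can fall back before anything is evaluated.
rel4 <- tryCatch(
  project_checked(rel3, fake_project, limit = 5L, margin = 2L),
  error = function(e) NULL
)
```

Because the check runs before the projection is built, the error surfaces at construction, which is the point at which a fallback is still possible.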