duckdb-r
duckdb-r copied to clipboard
MAP type is not working in R
> con <- dbConnect(duckdb::duckdb())
> dbGetQuery(con, "SELECT map([1,2],['a','b']) AS x;")
Error: rapi_prepare: Unknown column type for prepare: MAP(INTEGER, VARCHAR)
Thanks! Tagging https://github.com/duckdb/duckdb/issues/8859 since the two are similar issues
the histogram(arg) aggregate function does not work for the same reason
> tbl(con,sql("SELECT * FROM range(100) AS tt(x)")) %>%
+ summarise(h=sql("histogram(x)"))
Error in `collect()`:
! Failed to collect lazy table.
Caused by error:
! rapi_prepare: Unknown column type for prepare: MAP(BIGINT, UBIGINT)
Run `rlang::last_trace()` to see where the error occurred.
Thanks, confirmed.
library(DBI)
con <- dbConnect(duckdb::duckdb())
dbGetQuery(con, "SELECT map([1,2],['a','b']) AS x;")
#> Error: rapi_prepare: Unknown column type for prepare: MAP(INTEGER, VARCHAR)
dbExecute(con, "COPY (SELECT map([1,2],['a','b']) AS x) TO 'map.parquet'")
#> [1] 1
parquet <- arrow::read_parquet("map.parquet")
tibble::as_tibble(parquet)
#> # A tibble: 1 × 1
#> x
#> <list<
#> tbl_df<
#> key : integer
#> value: character
#> >
#> >>
#> 1 [2 × 2]
parquet$x[[1]]
#> # A tibble: 2 × 2
#> key value
#> <int> <chr>
#> 1 1 a
#> 2 2 b
Created on 2024-04-24 with reprex v2.1.0
For reference, Arrow converts dictionaries to two-column data frames, this is what we should do here too.
Side note: The headers for list-of-data-frame columns look odd. I suspect we need to work around in pillar.
I mentioned this at the old repo but, for posterity, a relatively simple workaround is UNNEST(MAP_ENTRIES(..., recursive := TRUE)). This will coerce the map dictionary into a regular 2-D data.frame that R understands.
Using @krlmlr's example:
library(DBI)
con = dbConnect(duckdb::duckdb(), shutdown = TRUE)
dbGetQuery(
con,
"
FROM (SELECT map([1,2],['a','b']) AS x)
SELECT UNNEST(MAP_ENTRIES(x), recursive := TRUE)
"
)
#> key value
#> 1 1 a
#> 2 2 b
PR is here: https://github.com/duckdb/duckdb-r/pull/165