duckdb-r icon indicating copy to clipboard operation
duckdb-r copied to clipboard

MAP type is not working in R

Open nassuphis opened this issue 1 year ago • 5 comments

> con <- dbConnect(duckdb::duckdb())
> dbGetQuery(con, "SELECT map([1,2],['a','b']) AS x;")
Error: rapi_prepare: Unknown column type for prepare: MAP(INTEGER, VARCHAR)

nassuphis avatar Jan 04 '24 00:01 nassuphis

Thanks! Tagging https://github.com/duckdb/duckdb/issues/8859 since the two are similar issues

Tmonster avatar Jan 04 '24 00:01 Tmonster

the histogram(arg) aggregate function does not work for the same reason

 > tbl(con,sql("SELECT * FROM range(100) AS tt(x)")) %>%
+ summarise(h=sql("histogram(x)"))
Error in `collect()`:
! Failed to collect lazy table.
Caused by error:
! rapi_prepare: Unknown column type for prepare: MAP(BIGINT, UBIGINT)
Run `rlang::last_trace()` to see where the error occurred.

nassuphis avatar Jan 04 '24 11:01 nassuphis

Thanks, confirmed.

library(DBI)
con <- dbConnect(duckdb::duckdb())
dbGetQuery(con, "SELECT map([1,2],['a','b']) AS x;")
#> Error: rapi_prepare: Unknown column type for prepare: MAP(INTEGER, VARCHAR)

dbExecute(con, "COPY (SELECT map([1,2],['a','b']) AS x) TO 'map.parquet'")
#> [1] 1
parquet <- arrow::read_parquet("map.parquet")
tibble::as_tibble(parquet)
#> # A tibble: 1 × 1
#>                                                            x
#>   <list<
#>   tbl_df<
#>     key  : integer
#>     value: character
#>   >
#> >>
#> 1                                                    [2 × 2]
parquet$x[[1]]
#> # A tibble: 2 × 2
#>     key value
#>   <int> <chr>
#> 1     1 a    
#> 2     2 b

Created on 2024-04-24 with reprex v2.1.0

For reference, Arrow converts dictionaries to two-column data frames, this is what we should do here too.

Side note: The headers for list-of-data-frame columns look odd. I suspect we need to work around in pillar.

krlmlr avatar Apr 24 '24 07:04 krlmlr

I mentioned this at the old repo but, for posterity, a relatively simple workaround is UNNEST(MAP_ENTRIES(..., recursive := TRUE)). This will coerce the map dictionary into a regular 2-D data.frame that R understands.

Using @krlmlr's example:

library(DBI)
con = dbConnect(duckdb::duckdb(), shutdown = TRUE)
dbGetQuery(
    con,
    "
    FROM (SELECT map([1,2],['a','b']) AS x)
    SELECT UNNEST(MAP_ENTRIES(x), recursive := TRUE)
    "
)
#>   key value
#> 1   1     a
#> 2   2     b

grantmcdermott avatar May 01 '24 21:05 grantmcdermott

PR is here: https://github.com/duckdb/duckdb-r/pull/165

hannes avatar May 08 '24 12:05 hannes